PREFACE


An  international  conference  entitled  "Simulation  of  Adaptive  Behavior:  From  Animals  to 
Animats"  took  place  in  Paris  on  September  24-28,  1990.  The  object  of  the  conference  was  to  bring
together  researchers  in  ethology,  ecology,  cybernetics,  artificial  intelligence,  robotics,  and  related
fields  so  as  to  further  our  understanding  of  the  behaviors  and  underlying  mechanisms  that  allow 
animals  and,  potentially,  robots  to  adapt  and  survive  in  uncertain  environments. 

"SAB90",  as  we  called  it,  was  the  first  major  conference  to  test  the  hypothesis  that  jjeople  in¬ 
terested  in  understanding  animal  behavior,  and  people  interested  in  simulating  or  constructing 
autonomous  robots,  would  have  important  common  interests  and  would  welrome  the  chance  to 
listen  to  and  learn  from  each  other.  The  conference  further  tested  the  somewhat  more  radical  hy¬ 
pothesis  that  its  focus  constituted  not  only  an  intersection  but  a  growing  new  field  concerned,  in 
both  animals  and  "aiumats",  with  adaptive  behavior. 

By  a  variety  of  measures  including  size  and  international  range  of  attendance,  intellectual 
enthusiasm,  quality  and  diversity  of  contributions,  and  degree  of  interaction  among  the  partici¬ 
pants,  SAB90  offered  strong  support  for  both  these  hypotheses.  Furthermore,  the  emergence  of 
a  field  in  its  own  right  was  signaled  by  the  fact  that  while  there  was  lively  debate  along  many
axes  (e.g.,  top-down  vs.  bottom-up,  learning  vs.  reflexes,  hierarchical  vs.  flat,  simulate  vs.  build,
to  mention  a  few),  it  was  striking  how  people  everywhere  along  the  animals  to  animats  axis  were
thinking  about  similar  sets  of  problems. 

These  proceedings  contain  62  papers,  59  that  were  actually  presented  at  the  conference,  plus 
three  whose  authors  could  not  attend.  The  book  is  divided  into  sections  corresponding  to  the 
conference  sessions.  In  each  section,  papers  presented  as  talks  are  followed  by  related  papers  that 
were  presented  as  posters. 

The  first  section,  The  Animat  Approach,  contains  papers  on  artificial  animal  research  as  a  tool
for  understanding  adaptive  behavior  and,  indeed,  as  a  new  approach  to  artificial  intelligence.
The  next  sections  (Perception  and  Motor  Control,  Cognitive  Maps  and  Internal  World  Models,
Motivation  and  Emotion,  Action  Selection  and  Behavioral  Sequences,  Ontogeny  and  Learning,
Collective  Behaviors,  and  Evolution  of  Behavior)  contain  papers  on  these  themes  from  both  the
animal  and  animat  perspectives.  There  follows  a  large  section  on  Architectures,  Organizational
Principles,  and  Functional  Approaches,  containing  several  strong,  and  differing,  theses  on
how  to  understand  or  achieve  natural  or  artificial  systems  with  adaptive  behavior.  The  book
concludes  with  a  two-paper  section,  Animats  in  Education,  that  describes  novel  and  uncomplicated
robot  and  simulation  technologies  designed  for  teaching  and  research.

SAB90  could  not  have  taken  place  without  the  assistance  of  many  people  and  organizations. 
We  are  especially  grateful  to  members  of  the  Program  Committee,  whose  conscientious  reviewing
selected  the  papers  here  from  the  more  than  90  submitted,  and  who  ably  chaired  the  conference
sessions.  The  Committee  members  were

Lashon  Booker,  MITRE  Corporation,  USA 
Rodney  Brooks,  MIT  Artificial  Intelligence  Lab,  USA 
Patrick  Colgan,  Queen's  University  at  Kingston,  Canada 
Patrick  Greussay,  University  Paris  VIII,  France
David  McFarland,  Balliol  College,  Oxford,  UK
Luc  Steels,  VUB  AI  Lab,  Belgium
Richard  Sutton,  GTE  Laboratories,  USA
Frederick  Toates,  The  Open  University,  UK
David  Waltz,  Thinking  Machines  Corp.  and  Brandeis  University,  USA

We  thank  each  of  the  following  sponsors  of  the  conference,  and  mention  particularly  AFOSR
which  enabled  us  to  provide  substantial  needed  travel  assistance,  and  the  anonymous  Corporate
Donor  whose  early  confidence  in  our  project  was  a  great  boost.

worked  long  and  hard  to  make  the  conference  a  success.  We  thank  Anne  Brelet,  Eric  Granjeon,
Agnès  Guillot,  Jean-Louis  Pennetier,  Philippe  Tarroux,  and  Pierre  Vincens.

We  wish  to  express  our  gratitude  to  the  Ministère  de  la  Recherche  et  de  la  Technologie  for
having  generously  placed  at  our  disposal  the  Amphithéâtre  Poincaré  in  which  the  conference
sessions  were  held.  We  also  express  our  particular  thanks  to  Josiane  Sene  (Administrator  of  the 

We  invite  readers  to  enjoy  and  profit  from  the  papers  in  this  book,  and  look  forward  to  the
next  SAB  conference!


Jean-Arcady  Meyer  and  Stewart  W.  Wilson 
Abstract

This  paper  first  outlines  the  methodology  of 
schema  theory  (Arbib  1981),  which  integrates 
perception  and  action  by  decomposing  an 
overall  behavior  into  the  interaction  of 
functional,  neurally  explicable,  units  called 
schemas.  It  offers  comparisons  with  other
methodologies  from  Artificial  Intelligence  (AI) 
and  Brain  Theory  (BT),  and  reviews  the  RS 
(Robot  Schema)  language  and  the  Arbib  and 
House  (1987)  model  of  detour  behavior  in  Rana 
computatrix  which  associates  potential  fields 
with  objects  (an  attractant  for  the  prey,  a  repulsor
for  the  fencepost,  and  a  forward  field
for  the  toad  itself),  which  are  then  combined  to
create  more  complex  fields  which  determine 
the  trajectory  of  the  animal. 

However,  rather  than  analyze  detour  behavior 
here,  the  remainder  of  the  paper  presents  a 
schema-theoretic  model  for  the  decision¬ 
making  mechanisms  which  control  prey- 
catching  behavior  in  frog  and  toad;  the  exten¬ 
sion  of  the  work  to  model  predator-avoidance 
is  discussed  elsewhere.  It  thus  contributes  to 
Rana  computatrix,  an  evolving  set  of  models  of 
anuran  visuomotor  coordination  (e.g.,  Arbib
1987).  Our  new  model  of  prey-catching  is 
rooted  in  recent  experimental  data  on  the 
behavior  of  animals  with  and  without  brain 
lesions.  These  data  motivate  the  model's  use  of 
independent  processing  of  the  different
parameters  that  define  the  stimulus  position
(horizontal  eccentricity,  elevation  and  distance).
The  model,  which  emphasizes  action
generated  by  the  concurrent  activity  of
multiple  motor  schemas  rather  than  the  serial
activity  of  such  schemas,  predicts  new  behaviors
for  experimental  test.

^  The  research  described  in  this  paper  was  supported
in  part  by  grant  no.  1  R01  NS  24926  from  the  National
Institutes  of  Health  (M.A.  Arbib,  Principal  Investigator)
and  Fulbright/MEC  fellowship  FU88-350I1116
(Spain)  to  A.C.

^  Present  address:  Center  for  the  Neurobiology  of
Learning  and  Memory,  University  of  California  at
Irvine,  Irvine,  CA  92717,  USA.

1.  An  Introduction  to  Schema  Theory 

Schema  theory  (Arbib  1975,  1981)  provides  a  way  to 
tame  the  complexity  of  large  systems  that  are  to  func¬ 
tion  in  the  real  world,  offering  an  approach  explic¬ 
itly  designed  to  bridge  between  cognitive  science  and 
brain  theory  (BT),  as  well  as  to  contribute  to  distributed
artificial  intelligence  (DAI).  Schemas  are
active  modular  entities,  each  involving  data  struc¬ 
tures  and  control: 

a)  Schemas  serve  to  represent,  at  least,  perceptual 
structures  and  distributed  motor  control.  Schemas  are 
ultimately  defined  by  interaction  with  a  physical 
environment  rather  than  (as  in  most  AI  systems)  by 
cross-references  in  some  logical  formalism. 

b)  Schema  theory  provides  a  distributed  model  of 
computation.  The  brain  can  support  many  concurrent 
activities  for  recognition  of  different  objects,  and  the 
planning  and  control  of  different  activities.  Thus 
schema  theory  views  the  use,  representation,  and 
recall  of  knowledge  as  mediated  through  the 
activity  of  a  network  of  interacting  computing  agents, 
schema  instances.  This  activity  may  involve  passing 
of  messages,  changes  of  state  (including  activity 
level),  instantiation  to  add  new  schema  instances  to 
the  network,  and  deinstantiation  to  remove  instances. 

c)  The  activity  level  of  an  instance  of  a  perceptual 
schema  represents  a  "confidence  level"  that  the  object 
represented  by  the  schema  is  indeed  present;  while 
that  of  a  motor  schema  may  signal  its  "degree  of 
readiness"  to  control  some  course  of  action.  A  schema 
network  does  not,  in  general,  need  a  top-level  executor 
since  schema  instances  can  combine  their  effects  by 
distributed  processes  of  competition  and  cooperation 
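
The competition and cooperation among schema instances described in (c) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class name `SchemaInstance`, the linear update rule, the link weights, and the clipping of activity to [0, 1]. It is not the RS language and not the authors' model; it only shows how a winner can emerge without a top-level executor.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaInstance:
    name: str
    # Confidence level (perceptual schema) or degree of readiness (motor schema).
    activity: float
    # Hypothetical links: other instance name -> weight (>0 cooperates, <0 competes).
    links: dict = field(default_factory=dict)


def step(network, rate=0.1):
    """One synchronous update: each instance is nudged by its neighbours' activity."""
    current = {s.name: s.activity for s in network}
    for s in network:
        net_input = sum(w * current[other] for other, w in s.links.items())
        # Clip so that activity stays a level between 0 and 1.
        s.activity = min(1.0, max(0.0, s.activity + rate * net_input))


def run(network, steps=50):
    for _ in range(steps):
        step(network)
    return {s.name: round(s.activity, 3) for s in network}


# Illustrative network: two motor schemas compete, and a perceptual
# schema cooperates with one of them.
network = [
    SchemaInstance("prey_seen", 0.8),
    SchemaInstance("snap", 0.5, {"prey_seen": 0.5, "orient": -0.6}),
    SchemaInstance("orient", 0.5, {"snap": -0.6}),
]
result = run(network)
print(result)
```

In this toy network the motor schema that receives perceptual support ends up fully active while its competitor is driven to zero, with no central controller choosing between them.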


Biological  and  Computational  Stereo  Vision 
Abstract

A  computational  model  of  stereo  fusion  is  exam¬ 
ined  in  the  light  of  biological  and  psychophysical 
knowledge  of  stereo  vision  in  humans  and  other 
mammals.  Several  analogies  are  suggested,  in¬ 
cluding  the  use  of  independent  spatial-frequency 
channels  with  one-octave  separation,  the  role  of 
vergence  and  the  limits  of  fusion,  sensitivity  to 
vertical  disparity,  and  the  use  of  3  pools  of  disparity
detectors.  It  is  argued  that  the  similarity  between
the  morphology  of  the  visual  cortex  and  the  fine¬ 
grained,  SIMD  architecture  exploited  by  the  com¬ 
putational  model  leads  to  similar  constraints  on 
the  computation  of  stereo  disparity  in  both  mi¬ 
lieus,  and  therefore  naturally  leads  to  processes 
with  similar  properties. 

1  Introduction 

Scientific  investigation  of  stereo  vision  in  humans  and 
other  animals  has  an  extensive  history  in  neurobiology
and  psychology,  dating  from  Wheatstone’s  discovery  of 
the  phenomenon  in  1838  [19].  Recently  computational 
modelers  have  made  substantial  progress  in  simulating 
the  process  of  stereo  fusion  on  the  computer.  This  pa¬ 
per  examines  one  such  model  in  detail,  and  in  particular 
points  out  some  striking  similarities  between  the  model 
and  current  knowledge  of  stereo  vision  in  higher  mam¬ 
mals. 

1.1  Stereo  geometry 

Stereo  vision  is  a  way  of  interpreting  and  exploiting  vi¬ 
sual  information  that  is  relatively  well  understood,  in 
animals  as  well  as  machines.  The  reason  is  clear;  com¬ 
pared  to  other  perceptual  cues  for  depth,  the  problem  is 
well  defined.  Once  the  images  are  brought  into  point-to- 
point  correspondence,  recovering  the  third  dimension  is 
a  straightforward  application  of  trigonometry.

*The  work  described  in  this  article  was  supported  under
DARPA  contracts  MDA903-86-C-0084,  DACA76-85-C-0004,  and
89F737300.  Use  of  the  Connection  Machine  was  provided  by 
DARPA. 


Figure  1:  Basic  Stereo  Geometry

The  geometrical  principle  behind  stereo  vision,  illustrated
in  Figure  1,  is  quite  simple.  Assume  that  two
cameras  form  images  through  left  and  right  centers  of
perspective  l  and  r,  onto  planes  L  and  R.  (In  practice
these  would  be  imperfect  optical  lens  systems,  but  for
this  discussion  we  assume  ideal  “pinhole”  projections.)
Furthermore,  assume  that  the  cameras  are  fixed  upon
point  v,  which  is  to  say  that  the  two  rays  perpendicular
to  the  image  planes  passing  through  the  centers  of
perspective  (the  principal  rays)  intersect  at  v.  Let  $\theta_v$  be
the  angle  between  these  principal  rays.  We  say  that  the
absolute  disparity  of  v  is  $\theta_v$.  Now  consider  another  point
p  projected  onto  image  planes  L  and  R  as  shown,  and
let  the  angle  between  these  rays  be  $\theta_p$.  We  say  that  the
relative  disparity  of  p  with  respect  to  v  is  $\delta_{pv} = \theta_p - \theta_v$.
Relative  disparity  is  the  more  commonly  used  definition.
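
As a worked illustration of these definitions, the following Python sketch computes the angle subtended at a point by the two centers of perspective, and from it the relative disparity with respect to the fixation point. It is a sketch under stated assumptions: the geometry is reduced to a two-dimensional plan view, and the function names, the 6.5 cm baseline and the point coordinates are hypothetical values chosen for the example, not taken from the paper.

```python
import math


def subtended_angle(point, left, right):
    """Angle in radians at `point` between the rays to the two centers of perspective."""
    ax, ay = left[0] - point[0], left[1] - point[1]
    bx, by = right[0] - point[0], right[1] - point[1]
    cross = ax * by - ay * bx
    dot = ax * bx + ay * by
    return abs(math.atan2(cross, dot))


def relative_disparity(p, v, left, right):
    """Relative disparity of p with respect to the fixation point v."""
    return subtended_angle(p, left, right) - subtended_angle(v, left, right)


# Hypothetical plan-view layout (metres): centers of perspective 6.5 cm
# apart, fixation point 1 m straight ahead.
left, right = (-0.0325, 0.0), (0.0325, 0.0)
v = (0.0, 1.0)
near = (0.0, 0.5)
far = (0.0, 2.0)

d_near = relative_disparity(near, v, left, right)
d_far = relative_disparity(far, v, left, right)
print(f"absolute disparity of v: {subtended_angle(v, left, right):.4f} rad")
print(f"relative disparity, near point: {d_near:+.4f} rad")
print(f"relative disparity, far point:  {d_far:+.4f} rad")
```

With this sign convention a point nearer than the fixation point has positive relative disparity and a point beyond it has negative relative disparity.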

The  circle  through  l,  r,  and  v  (actually  a  sphere)
is  called  the  Vieth-Müller  circle  (closely  related  to  the
Abstract

As  its  internal  state  and  external  environment  continuously
change,  any  truly  autonomous  agent  must  choose
actions  which  are  most  appropriate  to  its  immediate
circumstances.  This  paper  explores  the  idea  that  neurobiological
design  principles  can  be  applied  to  the  flexible
control  of  autonomous  agents.  We  describe  a  simulated
insect  whose  behavior  is  controlled  by  an  artificial
nervous  system.  In  particular,  we  focus  on  the  neural 
basis  of  two  different  examples  of  behavioral  choice  in 
this  artificial  insect. 

1  Introduction 

One  of  the  most  fundamental  problems  faced  by  any 
agent,  either  natural  or  artificial,  which  must  function 
autonomously  in  the  real  world  is  deciding  what  to  do 
next.  As  both  its  external  environment  and  internal 
state  continuously  change,  an  autonomous  agent  must 
constantly  choose  actions  which  lead  to  global  behav¬ 
ior  most  appropriate  for  the  current  situation.  Broadly 
speaking,  the  problem  of  behavioral  choice  encompasses 
the  entire  spectrum  from  minor  adjustments  of  ongoing
behavior  to  discrete  switches  between  different  behav¬ 
iors.  In  addition,  it  involves  the  generation  of  groups  of 
related  behaviors  with  the  appropriate  timing  and  se¬ 
quencing  to  accomplish  specific  objectives  (McFarland, 
1981,  pp.  118-121). 

How  should  the  control  system  of  an  autonomous 
agent  be  organized  to  support  such  behavioral  choice? 
Recently,  there  has  been  a  trend  toward  more  dis¬ 
tributed  approaches.  For  example,  Brooks  (1986)  has 
been  exploring  the  subsumption  architecture  for  au¬ 
tonomous  agent  control.  This  architecture  consists  of 
layers  of  task-achieving  behaviors  each  of  which  is  im¬ 
plemented  as  a  network  of  finite  state  machines  aug¬ 
mented  by  timers  and  registers.  Interactions  between 
behaviors  are  handled  by  allowing  machines  in  one  layer 
to  suppress  interactions  between  machines  in  lower  levels.
In  a  similar  vein,  Maes  (1989)  has  proposed  an
approach  to  action  selection  in  which  a  collection  of 
simple  agents  interact  by  passing  activation  along  a  va¬ 
riety  of  special-purpose  links. 

Our  own  approach  is  grounded  in  the  study  of  the 
neuronal  mechanisms  underlying  the  behavior  of  simpler
natural  animals,  a  field  known  as  neuroethology
(Camhi,  1984).  We  have  been  exploring  the  idea  that 
neural  network  control  architectures  for  autonomous 
agents  can  be  designed  using  principles  drawn  directly 
from  biological  nervous  systems.  This  approach  has  the 
advantage  that  more  direct  interactions  between  biolog¬ 
ical  and  artificial  mechanisms  for  autonomous  behavior 
are  possible.  This  style  of  modeling  was  first  proposed 
by  Braitenberg  (1984).  In  this  paper,  we  describe  an 
artificial  nervous  system  we  have  designed  for  controlling
the  behavior  of  a  simulated  insect.  The  design
of  this  insect  is  based  in  part  upon  specific  behaviors 
and  neural  circuits  drawn  from  several  natural  animals. 
We  focus  here  on  two  different  examples  of  behavioral 
choice  in  this  artificial  insect. 

2  The  Artificial  Insect  Project 

The  artificial  insect  project  is  aimed  at  exploring  the 
use  of  neuroethological  principles  to  design  artificial 
nervous  systems  for  controlling  the  behavior  of  complete
autonomous  agents,  an  endeavor  which  we  have
termed  computational  neuroethology  (Beer,  1990).  We 
have  developed  a  simulated  insect,  a  simulated  envi¬ 
ronment  with  which  it  must  cope,  and  an  artificial  ner¬ 
vous  system  for  controlling  its  behavior.  The  insect  is 
capable  of  locomotion,  wandering,  edge-following,  and 
feeding,  as  well  as  properly  managing  the  interactions 
between  its  various  behaviors.  In  order  to  understand 
behavioral  choice  in  this  artificial  insect,  it  is  essential 
to  understand  the  details  of  its  design,  which  we  briefly 
review  below. 

The  artificial  insect  is  a  two-dimensional  abstraction 
of  a  biological  insect  (see  Figure  5).  Its  body  consists 
Abstract

Probabilistic  models  were  developed  to  represent  animals'
movements.  The  simplest  one  makes  it  possible  to
compute  the  sinuosity  of  an  animal’s  search  path  and  to
determine  some  basic  properties  such  as  its  diffusion.
Applying  this  model  in  the  framework  of  optimal  foraging
theory  led  us  to  determine  the  sinuosity  value  which
minimizes  the  path  length  of  a  central  place  forager.
More  complex  models,  integrating  cybernetic  controls  of 
the  sinuosity  and  the  velocity  as  a  function  of 
environmental  stimulations,  show  how  animals  can  orient 
themselves  in  relation  to  a  stimulation  gradient  or  exploit 
patchy  environments  using  simple  klino-  and  ortho-kinetic 
mechanisms.  Another  type  of  movement  model  was 
developed  to  study  orientation  mechanisms  based  on  an 
egocentric  spatial  memory. 

1.  Introduction 

Animals  often  exhibit  random  search  paths:  take  for 
example  the  paths  of  foraging  ants,  which  anybody  can 
observe.  This  intrinsic  randomness  does  not  however 
prevent  the  animals  from  orienting  efficiently  towards 
specific  goals  and/or  aggregating  in  the  most  suitable
areas  of  their  environment.  To  understand  space-use 
mechanisms  (those  whereby  an  animal  regulates  the  time
it  spends  in  the  various  areas  of  the  environment)  and
orientation  mechanisms  (those  whereby  an  animal  moves
towards  a  specific  goal),  it  is  necessary  to  first  model  the
search  paths.  It  is  afterwards  important  to  determine
which  environmental  cues  are  relevant  to  animals  and
which  kinetic  parameters  they  have  to  regulate  to  be
efficient.  Using  modeling  and  computer  simulation  of
animals’  movements,  we  have  attempted  to  formalize 
some  of  the  mechanisms  involved  in  movement  control.  In 
this  context,  animals  have  been  taken  to  be  probabilistic 
self-directed  mobile  agents. 

Here  we  present  a  general  overview  of  the 
theoretical  studies  we  have  published  over  recent  years  in 
the  field  of  modeling  animals’  movements.  These  models 
deal  with  a  large  range  of  natural  spatial  behaviours,  from 
random  foraging  to  oriented  movements  based  on  spatial 
memory.  Some  of  these  models  link  up  with  the  optimal 
foraging  theory:  they  are  an  attempt  to  determine  which 
movement  strategies  maximize  the  efficiency  of  food
searching  in  a  stochastic  environment.  Other  models  were 
devised  with  a  view  to  explaining  orientational 
performances  on  the  basis  of  elementary  sensorimotor 
probabilistic  mechanisms.  Since  a  mathematical  approach 
would  be  too  complex  to  be  practicable  here,  the 
properties  of  our  models  were  established  using  computer 
simulations. 
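
A minimal Python sketch of the kind of computer simulation referred to above is a correlated random walk, in which the heading changes at each step by a random turning angle. The specifics here are assumptions made for illustration rather than the authors' model: a constant step length, Gaussian turning angles with standard deviation `turn_sd`, and net displacement as a crude stand-in for how far a path diffuses.

```python
import math
import random


def correlated_random_walk(n_steps, step_length, turn_sd, rng):
    """Return the (x, y) positions of a walk whose heading changes at each
    step by a Gaussian turning angle with standard deviation turn_sd (radians)."""
    x, y, heading = 0.0, 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        heading += rng.gauss(0.0, turn_sd)
        x += step_length * math.cos(heading)
        y += step_length * math.sin(heading)
        path.append((x, y))
    return path


def net_displacement(path):
    """Straight-line distance between the first and last positions."""
    (x0, y0), (x1, y1) = path[0], path[-1]
    return math.hypot(x1 - x0, y1 - y0)


def mean_displacement(turn_sd, n_walks=50, n_steps=200, seed=0):
    """Average net displacement over several walks with the same turning spread."""
    rng = random.Random(seed)
    total = sum(
        net_displacement(correlated_random_walk(n_steps, 1.0, turn_sd, rng))
        for _ in range(n_walks)
    )
    return total / n_walks


print("nearly straight paths:", round(mean_displacement(0.1), 1))
print("highly sinuous paths: ", round(mean_displacement(2.0), 1))
```

For the same number of steps, paths with a small turning spread carry the walker much farther from its starting point than highly sinuous ones, which is the sense in which the tortuosity of a search path governs its diffusion.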
Abstract

Ethologists  have  identified  many  ways  that  in¬ 
nate  behavioral  primitives  and  predispositions 
facilitate  the  learning  of  complex  behaviors. 

This  paper  applies  some  of  these  insights  to 
classifier  systems.  We  show  how  certain  built- 
in  mechanisms  for  generating  behavior  in  clas¬ 
sifier  systems  provide  an  inductive  bias  that 
can  be  manipulated  to  improve  learning.  The 
effects  of  this  bias  on  learning  are  illustrated 
by  a  classifier  system  that  learns  to  solve  a 
simple  navigation  task. 

1.  Introduction 

Animals  are  born  with  a  large  repertoire  of  innate,  co¬ 
ordinated  patterns  of  muscle  movement  and  behavior 
commonly  referred  to  as  instincts.  Instincts  are  often 
thought  of  as  inflexible,  low-level  “motor  programs”. 
However,  ethologists  have  discovered  that  innate  be¬ 
haviors  in  fact  often  use  learning  as  a  strategy  for  filling 
in  details  that  are  too  complex  to  specify  completely  in 
advance.  While  some  instinctive  behaviors  may  be  rigid 
and  stereotyped,  many  others  are  remarkably  plastic. 

The  relationship  between  instinct  and  learning  is  not 
just  relevant  to  ethologists  however.  Computational 
models  of  adaptive  behavior  can  also  benefit  from  un¬ 
derstanding  the  important  influences  of  prior  structure 
on  learning  and  behavior.  Unfortunately,  the  role  of 
prior  structure  and  innate  rules  of  behavior  is  ignored  in 
most  computational  models  of  adaptive  behavior.  Ex¬ 
ternal  reinforcement  is  usually  viewed  as  the  primary, 
if  not  the  only,  influence  on  learning. 

This  paper  examines  how  prior  structure  or  “instinct”
can  work  together  with  reinforcement  in  classifier
systems,  a  rule-based  framework  for  studying  adaptive
behavior.  The  next  section  briefly  reviews  a  few
of  the  relationships  between  instinct  and  learning  that 
are  evident  in  animal  behavior.  In  subsequent  sections
we  describe  the  classifier  system  framework  and  show 
how  it  can  be  used  to  implement  computational  models 
of  some  of  these  relationships.  In  particular,  we  show 
how  certain  built-in  mechanisms  for  generating  behav¬ 
ior  provide  an  inductive  bias  that  can  be  manipulated 
to  improve  learning  in  classifier  systems.  The  effects 
of  this  bias  on  learning  are  illustrated  by  a  classifier 
system  that  learns  to  solve  a  simple  navigation  task. 

2.  Instinct  and  Learning 

At  one  extreme,  instinctive  behaviors  can  be  sufficiently 
preordained  and  inflexible  that  they  proceed  to  com¬ 
pletion  automatically  once  they  are  triggered  by  an  ap¬ 
propriate  stimulus.  These  motor  programs  are  called 
fixed  action  patterns  and,  once  initiated,  they  often  re¬ 
quire  little  or  no  external  feedback.  A  classic  example 
of  a  fixed  action  pattern  is  the  egg-rolling  behavior  of 
geese,  which  a  goose  will  complete  even  if  the  egg  is 
taken  away  (Tinbergen,  1951). 

Other  kinds  of  innate  behavior  patterns  are  rigidly 
programmed,  yet  exhibit  a  great  deal  of  “run-time” 
flexibility  (Gould,  1982).  Examples  include  the  con¬ 
struction  of  bird  nests,  beaver  dams,  and  spider  webs. 
These  structures  have  fixed,  species-specific  character¬ 
istics,  but  the  building  behavior  can  adapt  itself  to  a 
wide  variety  of  both  predictable  and  unpredictable  con¬ 
tingencies  in  the  immediate  environment.  Even  more 
sophisticated  examples  of  plasticity  are  evident  in  the 
way  human  infants  learn  to  crawl  and  walk.  One  innate 
component  of  this  behavior  seems  to  be  a  goal-directed 
specification  of  what  to  learn  (infants  seem  to  have  a
built-in  sense  of  which  movements  “feel  right”)  coupled
with  a  motivational  drive  for  repeated  experimentation.
Moreover,  the  learned  movements  show  a  further
flexibility  in  the  way  they  recalibrate  themselves  to
accommodate  the  growth  of  the  body.  These  are  just 
a  few  examples  of  the  many  ways  instinctive  mecha¬ 
nisms  and  predispositions  can  facilitate  the  learning  of 
complex  behaviors. 
Abstract

In  recent  years  there  has  been  a  move  within  the  ar¬ 
tificial  intelligence  and  robotics  communities  towards 
building  complete  autonomous  creatures  that  operate 
in  the  physical  world.  Certain  approaches  have  proven 
quite  successful,  and  have  caused  a  re-analysis  within 
the  field  of  artificial  intelligence  of  what  components  are 
necessary  in  the  intellectual  architecture  of  such  creatures.
However  nothing  built  thus  far  yet  comes  close
to  the  dreams  that  many  people  hold  dearly.  Furthermore
there  has  been  quite  some  criticism  of  the  new
approaches  for  lacking  adequate  theoretical  justification.
In  this  paper  we  outline  some  of  the  more  obvious  chal¬ 
lenges  that  remain  for  these  new  approaches,  and  sug¬ 
gest  new  ways  of  thinking  about  the  tasks  ahead,  in  or¬ 
der  to  decompose  the  field  into  a  number  of  manageable 
sub-areas  that  can  be  used  to  shape  further  research. 

1  Introduction 

There  is  a  growing  interest  in  building  artificial  crea¬ 
tures  of  some  sort.  One  example  is  the  recent  boom 
in  a  field  known  as  Artificial  Life  (see  [Langton  87]
and  [Langton  90]).  While  much  of  the  emphasis  is  on
building  forms  resident  in  computers,  which  are  agents 
acting  in  an  information  domain,  there  has  also  been 
some  interest  in  physical  embodiments  of  artificial  crea¬ 
tures. 

This  author,  at  the  MIT  AI  Lab,  introduced  the  sub¬ 
sumption  architecture  ([Brooks  86]  and  extended  in 
[Brooks  90])  with  the  explicit  goal  of  building  mo¬ 
bile  robots  with  long  term  autonomy.  Later  the  word 
creature  crept  into  the  language  of  the  MIT  group  (e.g., 
[Connell  87]).  The  goal  is  to  build  autonomous  mobile 
robots  which  operate  over  long  periods  of  time,  com¬ 
pletely  autonomously,  in  dynamic  worlds.  It  is  envi¬ 
sioned  that  these  worlds  are  worlds  which  already  exist 
for  some  other  purpose — not  worlds  specially  built  to 
house  the  robots.  Further,  it  is  envisioned  that  these 
robots  carry  out  some  task  which  has  some  utility  for 
whoever  wanted  the  robots  to  exist  and  live  in  this
world.

As  [Flynn  87]  points  out,  there  are  many  compo¬ 
nents  to  such  creatures,  including  sensors,  actuators, 
power  sources,  and  intelligence.  Over  the  last  five  years 
we  have  found  that  all  these  components  are  intimately 
related  as  we  have  tried  to  build  prototype  creatures 
([Flynn  and  Brooks  89]).  Choices  in  any  part  of  the 
system  architecture  (e.g.,  sensor  characteristics)  have 
major  impacts  upon  other  parts  of  the  system.  In  gen¬ 
eral  it  is  very  dangerous  to  think  that  any  one  compo¬ 
nent  (such  as  intelligence)  can  be  isolated  and  studied 
by  itself. 

Our  experience  with  the  subtleties  of  such  interactions 
has  led  us  to  our  current  construction  of  a  very  complex 
robot,  named  Attila  ([Angle  and  Brooks  90])  pictured
in  figure  1  (in  fact  we  are  building  multiple  copies
of  Attila).  It  has  six  legs,  each  with  three  degrees  of 
freedom,  an  active  whisker,  a  gyro  stabilized  pan-tilt 
head  carrying  a  range  finder  and  a  CCD  camera,  10  on¬ 
board  processors,  and  over  150  sensors.  We  built  an 
earlier  six  legged  robot  named  Genghis  ([Angle  89],
[Brooks  89]),  but  its  complexity  pales  in  comparison  to
that  of  Attila.  Many  of  the  issues  raised  in  this  paper 
were  brought  to  our  attention  as  we  have  tried  to  work 
out  how  to  program  this  complex  robot  Attila  to  be  an 
artificial  creature. 

The  bulk  of  this  paper  is  devoted  to  the  problems  and 
challenges  in  designing  and  building  the  computational 
architectures  for  such  creatures.  However,  the  reader 
should  not  forget  that  the  other  aspects  of  a  creature’s 
architecture  cannot  be  considered  in  isolation  from  in¬ 
telligence.  In  a  complete  design,  all  aspects  greatly  in¬ 
fluence  each  of  the  others. 

We  first  argue  that  there  are  multiple  levels  of  analy¬ 
sis  or  abstraction  with  which  we  must  be  concerned  in 
designing  and  building  complete  creature  architectures. 
There  can  be  no  single  magic  bullet  or  theory  which  will 
tell  us  all  we  need  to  know.  Some  problems  within  these 
levels  are  well  circumscribed  and  so  can  be  worked  on  in 
isolation.  However,  in  order  to  build  complete  creatures 
we  need  to  bridge  the  gaps  between  these  levels  also. 

The  bulk  of  the  paper  then  goes  on  to  examine  each 
Abstract

An  experiment  has  been  set  up  to  explore  the 
hypothesis  according  to  which  the  solution 
of  conflicts  of  motivations  is  reached  by  the 
trend  to  maximize  pleasure.  Subjects  were 
placed  in  a  situation  of  conflict  where  the 
pleasure  of  playing  a  videogame  clashed 
with  the  increasing  discomfort  of  a  cold 
environment.  The  time  lapse  tolerated  by 
each  of  the  subjects  could  be  predicted  from
the  algebraic  sum  of  the  rating  of  displeasure
aroused  by  the  cold  environment  and
the  rating  of  pleasure  aroused  by  the  videogame,
obtained  in  different  sessions.  This
result  supports  the  working  hypothesis  and
permits  the  conclusion  that  pleasure  is  the
common  currency  which  allows  tradeoffs 
among  various  motivations. 


1.  Models. 

One  may  identify  two  types  of  models:  those
describing  the  behavior  of  a  given  system 
and  those  describing  the  system  itself.  The 
first  type  of  model  is  pragmatic.  A  good 
behavioral  model  is  adequate  enough  when 
it  replicates  the  behavior,  and  is  able  to 
predict  future  behavioral  responses  of  the
system.  Yet  this  may  be  achieved  without
any  knowledge  of  the  intimate  mechanisms
that  produce  the  behavior.  In  the  same  way
as  a  given  envelope  function  can  be  approximated
with  various  summations  of  different
functions,  a  model  may  ignore  the
inside  of  the  black-box  system  whose  behavior
it  nevertheless  replicates  adequately.
The  model  may  lump  several  functions  into
one,  incorporate  approximations,  ignore
some  rare  or  extreme  conditions  but  remain 
nevertheless  a  good  model. 

The  second  type  of  model  is  more  ambitious
and  aims  at  theoretical  description  of
the  system.  This  type  of  model  is  more  difficult
to  achieve  because  it  includes  the  intimate
laws  of  the  system  and  because  it  implies
the  knowledge  of  the  structures  and
functions  of  the  constituent  elements  of  the 
system.  Eventually  this  second  type  also 
reaches  the  same  goal  as  models  of  the  first 
type. 

In an enlightening chapter on instinct
and motivation, Epstein (1982) has recently
reflected that there are behaviors
which are simple reflexes in the Cartesian
and Sherringtonian sense of the word,
but there are also complex behaviors. The
latter occur with the cooperative action of
an endogenous component, an acquired
component, and a reactive component. He
stressed that «we need concepts that take
account of the complexities of behaviors
that are not reflexive». Theoretical models
of complex behavior must include these three
components. This implies the recognition
and the incorporation of emergent properties
and functions within the central nervous
system, i.e. functions which cannot be
predicted from the sum of properties of the
elements of the central nervous system itself.
Excellent theorizing on this point
will be found in Toates (1986a).
Abstract

An  evolutionary  method  based  on 
selective  reproduction  and  random 
mutation  was  used  to  evolve  neural 
networks  that  control  two  types  of 
simple  organisms  which  can  reach  for 
objects  using  their  2-segment  arm. 
One  kind  of  organism  does  not  move 
and  can  only  capture  an  object  if  it  is 
at  reaching  distance;  the  other  can 
displace  itself  and  therefore  it  first 
approaches  an  object  and  then 
captures  it.  Individual  learning  during 
lifetime  to  predict  changes  in  the 
position  of  an  object  or  of  the  hand 
relative  to  the  organism's  body  helps 
in  the  evolution  of  the  object  reaching 
capacity,  although  inheritance  of  the 
weight  matrix  is  strictly  Darwinian. 
Finally, a more sophisticated fitness
criterion which penalizes arm
movements  causes  the  more  complex 
organism  to  move  its  arm  only  when 
an  object  is  at  reaching  distance. 


I.  Introduction 

Our  purpose  in  the  present  paper  is  to 
describe  an  attempt  at  evolving  simple 
simulated  organisms  that  have  the  capacity 
to reach for objects using their single 2-segment
arm. We will describe two such
organisms.  One  does  not  move  and  can  only 
reach  for  objects  if  the  objects  are  located  at 
reaching  distance  from  the  organism.  The 
second  organism  can  displace  itself  in  space 
and  therefore  can  approach  objects  and, 
when  they  are  at  reaching  distance,  it  can 
reach  for  them  with  its  arm. 


The methodology we have used to develop
such organisms is an evolutionary technique
based on selective reproduction and random
mutation (Holland, 1975; Goldberg, 1989).
(For other approaches to developing similar
behaviors see Jordan, 1989; Booker, 1988;
Patarnello and Carnevali, 1989.) An initial
population of organisms, each randomly
different from all others, is created. Each
organism lives in its individual environment,
which contains a number of objects. At the
end of their life the organisms are rank ordered based
on their performance on the object reaching
tasks. There is no learning of this task during
life, but performances vary because of
chance. Only the best performing organisms
are allowed to reproduce by generating a
number of copies of themselves, while the
others die out without leaving any
offspring. A small amount of random
variation is added to the copies so that some
offspring will be better and
some worse than their
common parent. However, selective
reproduction will ensure that a better
offspring is more likely to reproduce than a
worse one. The net result of this
evolutionary process is that the capacity to
reach for objects gradually increases across a
number of generations.
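
The procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the genome is assumed to be a flat list of weights, mutation is assumed to be Gaussian, and `toy_fitness` is a hypothetical stand-in for performance on the object reaching task. The population size, number of parents and mutation size are arbitrary choices.

```python
import random

def evolve(fitness, genome_len=8, pop_size=20, n_parents=4,
           generations=30, sigma=0.1, seed=0):
    """Rank-based selective reproduction plus random mutation (no crossover)."""
    rng = random.Random(seed)
    # Initial population: each individual is randomly different from the others.
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(genome_len)]
           for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Rank order the individuals by performance, best first.
        ranked = sorted(pop, key=fitness, reverse=True)
        best_per_generation.append(fitness(ranked[0]))
        parents = ranked[:n_parents]        # only the best reproduce
        copies = pop_size // n_parents      # each parent leaves this many offspring
        # Offspring are copies of the parent with a small random variation added.
        pop = [[w + rng.gauss(0.0, sigma) for w in parent]
               for parent in parents for _ in range(copies)]
    return best_per_generation

def toy_fitness(genome):
    # Hypothetical performance measure: closeness to an all-0.5 genome.
    return -sum((w - 0.5) ** 2 for w in genome)

history = evolve(toy_fitness)
```

Here `history` holds the best fitness of each generation. Parents are not carried over unchanged, so the best value can dip from one generation to the next even though the trend is upward.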

Another  purpose  of  this  research  is  to 
examine  how  the  performance  that  emerges 
evolutionarily  can  be  controlled  and  shaped 
by  appropriately  manipulating  the  fitness 
criterion,  that  is,  the  criterion  in  terms  of 
which  individuals  are  rank  ordered  and 
which  therefore  dictates  who  will  reproduce 
and  who  won't.  In  the  second  simulation 
with  the  organism  which  both  displaces  itself 
in  space  and  moves  its  arm,  we  will  show 
Abstract

This paper questions approaches to computational
modeling of neural mechanisms underlying
behaviour. It examines the “simplifying”
(connectionist) models used in computational
neuroscience and concludes that, unless embedded
within a sensorimotor system, they are meaningless.
The implication is that future models should
be situated within closed-environment simulation
systems; output of the simulated nervous system
is then expressed as observable behaviour. This
approach is referred to as “computational
neuroethology”. Computational neuroethology offers
a firmer grounding for the semantics of the
model, eliminating subjectivity from the
result-interpretation process. A number of more
fundamental implications of the approach are also
discussed, chief of which is that insect cognition
should be studied in preference to mammalian
cognition.

1  Introduction 

This paper questions approaches to computational modeling
of the neural mechanisms underlying behaviour. It
examines the relationship between computational neuroscience
[26] and that style of modeling popularly referred
to as “connectionism”, “parallel distributed processing”,
or “neural networks”*, which has recently been subject to
renewed attention in the fields of cognitive science and
artificial intelligence (see e.g. [25, 23]).

*In this paper, these three terms will be treated as synonymous,
and referred to collectively as “connectionist” models.

Connectionist models are characterised by their simplified
nature and concomitant inattention to biological
data, and it is argued here that such “simplifying”
computational neuroscience has serious inadequacies. A
different approach is suggested which pays far more attention
to the sensorimotor system and hence to behavioural
interactions with the external environment. This approach
involves computational modeling of the neural
mechanisms underlying behaviour, in a manner akin to
that used in connectionism. Such an analysis of behaviour
as a product of neural activity is properly the
domain of the field of neuroethology, and the new approach
is therefore referred to as “computational
neuroethology”. Meaning is supplied to the models by
embedding them in simulated environments which supply
visual feedback without human intervention, that is, they
close the external feedback loop from motor output to
sensory input.

The advantage of computational neuroethology is that
the semantics of the network are well grounded, and
thus results are generated by observation rather than
by interpretation. That is, the fruits of computational
neuroethology simulations are “hard” objective measurements
rather than “soft” subjective ones. At a metatheoretical
level, it is argued that the computational network
simulation of cognitive processing should pay much more
attention to the evolutionary history of those faculties it
wishes to replicate. In particular, a conclusion of this
paper is that the study of linguistic processes using network
models is wildly premature. The study of insects is
advocated as the most fruitful path for future research.

As the reader will probably already have detected, this
paper is intentionally polemic. It is aimed at an interdisciplinary
audience, and the author is no polymath.
For that reason, this paper is offered as a provisional
manifesto in the hope that it provokes some interesting
discussion. The argument is based on previous work by
a number of authors. Because of its disputatious nature,
there are more direct quotes in this paper than is common.
There is no denying that this is a selective review
of the literature. This paper is abridged from [9].

The paper opens with a discussion of computational
neuroscience, distinguishing it from neural engineering,
and identifying two classes of model: realistic and simplifying.
Following this, the connectionist paradigm is
briefly summarised. Next, criticisms of connectionism
are discussed, with particular attention to the argument
that connectionist models have no semantic grounding
without behavioural linkage to a sensorimotor system.
Then, a remedy to this objection is proposed:
the adoption of the computational neuroethology approach.
Computational neuroethology is defined, and
a specific technique for providing a behavioural linkage
is discussed. This approach has some important implications
for future research, the most significant of which
Abstract

Studies in computer vision have only recently
realised the advantage of adding a behavioural
component to vision systems, enabling them to
make programmed ‘eye movements’. Such an animate
vision capability allows the system to employ
a nonuniform or foveal sampling strategy,
with gaze-control mechanisms repositioning the
limited high-resolution area of the visual field.

The hoverfly Syritta pipiens is an insect that exhibits
foveal animate vision behaviour highly similar
to the corresponding activity in humans. This
paper discusses a simulation model of Syritta created
for studying the neural processes underlying
such visually guided behaviour. The approach
differs from standard “neural network” modeling
techniques in that the simulated Syritta exists
within a closed simulated environment, i.e.
there is no need for human intervention: such
an approach is an example of computational
neuroethology.

1  Introduction 

For an animal (or an autonomous robot) to adapt and
survive in uncertain environments, a sense of sight is
undoubtedly a useful thing to have. The creation of
seeing machines has been the topic of much research
in artificial intelligence and computer vision. Unfortunately,
most such research has ignored the behavioural
contexts in which natural vision occurs. Recently, a research
paradigm known as animate vision has emerged,
where the seeing machine is given the ability to make
programmed ‘eye movements’, allowing it to look around:
animate vision acknowledges the behavioural contexts of
natural vision.

One such context is the need to control gaze when the
image sampling strategy uses nonuniform resolution, i.e.
where only a restricted area of the field of view is
high-acuity, as in the foveal vision commonly found in predatory
animals. Foveal vision offers a number of advantages for
any real-time visual system, whether artificial or natural
(robot or animal). Animate foveal vision is not an
exclusively mammalian trait. The hoverfly Syritta pipiens
exhibits animate foveal vision behaviour which is remarkably
similar to corresponding behaviour in humans.

The research project described here is a ‘neural network’
simulation-model study of the mechanisms underlying
animate vision in Syritta. The study of the neural
basis of behaviour is properly the domain of the field
of neuroethology, and this project (relying as it does on
computer simulation) is thus a form of “computational
neuroethology”.

The simulation models the hoverfly living in a closed
dynamic environment: activity in the model nervous system
is expressed as flight behaviour of the model fly,
which in turn generates new visual input for the simulated
eyes, which feed the model nervous system, thus
completing a visual feedback loop from sensorimotor output
to photoreceptor input. The model nervous system
is not ‘hard-wired’ but is created using techniques from
current ‘connectionist’ network learning theory and from
adaptive filter theory. The method by which the network
is created is incidental: it is the capability of the mature
network that is of interest. A number of different strategies
for creating the network are being explored; the optimal
solution (in the engineering sense) is not necessarily
the most biologically interesting.
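
The closed loop described above, in which motor output determines the next sensory input without human intervention, can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the "eye" reports only the angular offset of a single target, the "nervous system" is a fixed proportional rule rather than a trained network, and the names `sense`, `controller` and `run_closed_loop` are hypothetical.

```python
import math

def sense(heading, target_bearing):
    """Simulated 'eye': angular offset of the target from the line of sight (radians)."""
    # Wrap the difference so that the shortest turn direction is reported.
    return math.atan2(math.sin(target_bearing - heading),
                      math.cos(target_bearing - heading))

def controller(retinal_error, gain=0.5):
    """Stand-in for the model nervous system: turn in proportion to the error."""
    return gain * retinal_error

def run_closed_loop(heading=0.0, target_bearing=2.0, steps=20):
    errors = []
    for _ in range(steps):
        error = sense(heading, target_bearing)  # sensory input from the environment
        errors.append(abs(error))
        heading += controller(error)            # motor output changes the next input
    return errors

errors = run_closed_loop()
```

The recorded `errors` are a behavioural measurement read directly off the simulation: with this gain the offset halves on every step, so the model fly ends up fixating the target.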

This paper gives a brief review of the past research on
which this project builds, and then presents an overview
of the simulator system. The project draws on literature
in a number of fields: a more complete account of
the background literature and the simulator is given in
[13], from which this paper is abridged. Computational
neuroethology [12] is discussed here only in passing.

The work is at an early stage. So far the most significant
results are in the design of the simulator, especially
the generation of the view through Syritta's eyes.

2  Rationale 

Work of this nature does not have a long academic pedigree.
Therefore, the notes below concentrate on a few
papers in some depth, rather than superficially skimming
many. The animate vision paradigm is described,
followed  by  a  discussion  of  nonuniform  sampling.  The 
argument  underlying  the  interest  in  insects  is  then  re- 
Abstract

A  statistical  method  for  quantifying, 
summarizing,  and  evaluating  information 
about  the  behaviour  of  a  living  system  is 
illustrated  using  data  from  a  study  of  the 
ontogenetic  development  of  preferences  in 
guppies (Poecilia reticulata). This method
permits comparisons of groups of individuals
in different treatments when the data on each
individual consists of a long sequence of
behavioural  states  together  with  the  entry  time 
into  those  states.  This  tool  is  readily  adaptable 
to  the  study  of  a  simulation  of  an  organism, 
set  of  organisms,  or  the  comparison  of  robotic 
models  with  living  systems. 


1.  Introduction 

The  use  of  robotic  systems  to  model  living  organisms 
has  had  varying  degrees  of  success  in  expanding  our 
knowledge  of  living  systems.  Generally,  the  exercise  is 
fruitful when relevant aspects of the living system are
modelled  faithfully.  It  is  important,  therefore,  to  have 
tools  for  comparing  the  behaviour  of  robotic  systems 
with  the  living  systems  they  model. 

We present here one example of such a tool: a
statistical method for quantifying, summarizing, and
evaluating information about the behaviour of a living
system. We illustrate this statistical tool by a study of
preferences in guppies (Poecilia reticulata) for food and
conspecifics. The tool is readily adaptable to the study
of  a  simulation  of  an  organism  or  set  of  organisms,  or 
to  the  comparison  of  robotic  models  with  living  systems. 
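
The kind of record the method operates on, a sequence of behavioural states together with the entry time into each state, can be represented very simply. The sketch below shows only that data layout and two elementary summaries (time spent per state and transition counts). It is not the statistical method of this paper, and the state labels, times and the function name `summarize` are invented for illustration.

```python
from collections import Counter, defaultdict

def summarize(record, end_time):
    """record: list of (entry_time, state) pairs sorted by entry_time."""
    dwell = defaultdict(float)   # total time spent in each state
    transitions = Counter()      # counts of (from_state, to_state)
    for i, (t, state) in enumerate(record):
        has_next = i + 1 < len(record)
        t_next = record[i + 1][0] if has_next else end_time
        dwell[state] += t_next - t
        if has_next:
            transitions[(state, record[i + 1][1])] += 1
    return dict(dwell), dict(transitions)

# Hypothetical observation of one individual; times in seconds.
record = [(0.0, "near_food"), (12.0, "neutral"), (20.0, "near_conspecific"),
          (45.0, "neutral"), (50.0, "near_food")]
dwell, transitions = summarize(record, end_time=60.0)
```

The same record format could be logged from a simulated organism or a robot, which is what would make a like-for-like comparison with the living system possible.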

Guppies (Poecilia reticulata) are an intensively
studied species for many aspects of behaviour. Pilot
studies in our laboratory indicated significant variation
among both male and female guppies in their preferences
for stimuli of food and conspecifics. Such individual
variation  has  increasingly  attracted  the  attention  of 
behaviourists  (e.g.  for  fish  see  Magurran,  1986;  Gotceitas  & 
Colgan,  1988)  with  respect  to  its  basis  and  function.  From 
an  ontogenetic  perspective,  such  variation  raises  questions 
concerning  the  role  of  social  isolation  in  its  development, 
sexual  differences  in  this  ontogeny,  and  the  extent  to  which 
adult  behaviour  is  predicted  by  juvenile  behaviour. 

Ontogenetic  research  on  guppies  has  included  studies 
on  the  development  of  stimulus  preferences  (Candland  & 
Milne,  1966),  avoidance  of  predators  (Goodey  &  Liley, 
1986),  temperature  preferences  (Johansen  &  Cross,  1980), 
sexual  activities  (Liley  &  Wishlow,  1974),  and  the  role  of 
stress  and  rearing  conditions  (Pinckney  &  Anderson,  1967; 
Newton, 1982). Germane to the present work are the
following findings from these studies: isolated fish were less
active than controls, and activity decreased over the
duration of the observational trial (Newton, 1982). The
isolates scored higher frequencies of social display and
lower levels of sexual activity. Liley (1966) found that as
experience was gained with females, previously isolated
male guppies showed decreased levels of display. Pinckney
and Anderson (1967) observed that group-reared fish spent
a decreasing amount of time near the stimulus fish, in
contrast to the isolated fish, which spent an increasing amount
of time near the stimulus fish. The group-reared fish
showed a preference for the stimulus fish of the same sex,
while the isolated fish showed no significant preference for
either sex of stimulus fish.

All  of  these  studies  examined  the  behaviour  of  mature 
adult fish as the outcome of various manipulations. The
objective  of  the  present  study  was  to  monitor  at  frequent 
intervals  the  stimulus  preferences  and  activity  of  guppies 
throughout  the  juvenile  and  early  adult  periods,  and  to 
determine  the  extent  and  ontogeny  of  individual  differences 
in each sex. Individuals were raised as experimental fish
Abstract

We are interested in simulations of biological
evolution, i.e. simulations of populations of organisms
over many generations living in a complex
and dynamic environment. Our simulations
are microanalytic, meaning that each individual
organism and gene is separately represented,
and the biologically significant events in
an organism’s life are all separately simulated in
detail. Although we have been successful with
simple models, we have encountered fundamental
difficulties when scaling up the complexity
of the organisms and the complexity of behaviors
we expect of them. These difficulties all
lead to a single question: What is an appropriate
representation for an organism, i.e. what is
an appropriate programming paradigm in which
to express the complex behavior of organisms,
and how should such programs be encoded into
strings so that genetic algorithms will be successful
over them?

The project that brought these issues to
the surface is a complex evolutionary simulation
called AntFarm, in which we are attempting to
evolve cooperative foraging behavior in a population
of colonies of artificial “ants.” In this paper
we survey a number of candidate representations
for organisms that we have considered
for AntFarm, all of which have been used in the
past for simple evolution models. We show that
none of the representations are well-suited for
AntFarm. From their inadequacies we abstract
a number of principles that we believe are necessary
for successful evolution of complex artificial
life. Finding a representation that has all of the
properties we identify is still an open problem.

1  Introduction 

The simulation of evolving populations of artificial organisms
is very important in the study of ecological, adaptive,
and evolutionary systems whose dynamics are too complex
to study analytically or experimentally [11]. In this
paper we consider simulating organisms that live and reproduce
in relatively complex environments, with many
sensors (external and internal), and many possible actions
at each moment. In addition, the organisms possess some
amount of internal memory, allowing their behavior to be
history sensitive. In the course of its life each organism
is born, makes thousands of decisions (eat, move, mate,
etc.), and eventually dies. The reproductive success of a
particular organism is affected by its behavior throughout
its lifetime.

Our  simulations  are  microanalytic,  meaning  that  each 
individual  organism  and  gene  is  separately  represented, 
and  the  biologically  significant  events  in  an  organism’s 
life are all separately simulated in detail. Each organism
in the evolving population is separately represented
as  a  program.  Each  organism’s  life  is  represented  as  a 
process,  a  detailed  sequence  of  events  including  its  birth, 
its  interaction  with  a  dynamic  environment,  its  competi¬ 
tion  with  other  organisms,  its  mating  and  reproduction 
(if  any),  and  its  death. 

The  structural  representation  of  an  organism  consists 
of  the  following  parts: 

•  interpreter: used to execute organism behavior functions
(programs);

•  phenotype:  the  behavior  function  (program),  that 
maps  sensory  inputs  and  memories  into  memories 
and  effector  outputs; 

•  genotype:  a  bitstring  that  encodes  the  behavior 
function; 

•  development  function:  the  mapping  that  decodes  the 
genotype  to  produce  the  phenotype. 
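
The four parts listed above can be made concrete with a small sketch. It is only an illustration under loud assumptions: the genotype is read as a lookup table, which is a toy encoding chosen for brevity and not a representation proposed for AntFarm, and the names `develop`, `Organism` and `step` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Bits = List[int]
# A behavior function maps (sensory inputs, memory) to (new memory, effector outputs).
Behavior = Callable[[Bits, Bits], Tuple[Bits, Bits]]

def develop(genotype: Bits) -> Behavior:
    """Hypothetical development function: decode the bitstring as a lookup table."""
    def behavior(senses: Bits, memory: Bits) -> Tuple[Bits, Bits]:
        # Index the table with the sensor and memory bits (toy encoding).
        index = int("".join(map(str, senses + memory)), 2) % (len(genotype) // 2)
        new_memory = [genotype[2 * index]]
        action = [genotype[2 * index + 1]]
        return new_memory, action
    return behavior

@dataclass
class Organism:
    genotype: Bits   # static for life; the material used in reproduction
    memory: Bits     # internal state that makes behavior history sensitive

    def __post_init__(self):
        self.phenotype = develop(self.genotype)  # decoded once, at birth

    def step(self, senses: Bits) -> Bits:
        # The "interpreter" here is simply Python calling the behavior function.
        self.memory, action = self.phenotype(senses, self.memory)
        return action

ant = Organism(genotype=[0, 1, 1, 0, 1, 1, 0, 0], memory=[0])
first_action = ant.step([1])
```

Keeping `develop` outside the `Organism` class mirrors the statement that the development function is shared by all organisms, while each organism carries its own genotype and memory.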

In  all  of  our  experiments,  the  development  function  is 
fixed  for  all  organisms  and  for  all  time;  it  is  not  subject 
to evolution. The genotype, of course, differs from animal
to animal, but is static throughout the life of the
organism; it is the genetic material used in reproduction.
At the time of reproduction, recombination and mutation
operators are applied to a pair of parent genotypes
to  produce  an  offspring  genotype.  The  phenotype  of  an 
ABSTRACT

When  a  robot  has  to  move  in  a  locally  uncertain  environment,  propulsion  by 
means  of  walking  legs  is  advantageous  compared  to  a  wheel  driven  system. 
However,  the  control  of  walking  legs  is  more  complicated.  The  question  of 
how  the  movement  of  the  different  legs  is  coordinated  will  be  investigated 
here.  Three  different  solutions  which  have  been  developed  during  natural 
evolution will be compared. These are the walking systems of an insect, of
a crustacean, and of a mammal. The results show that coordinating
mechanisms  differ  considerably  in  these  animals. 


1.  INTRODUCTION 

An  autonomous  robot  which  has  to  move 
in  an  uncertain  environment  has  to  deal 
with  the  problem  of  how  to  perform  a 
goal-oriented  behaviour.  This  problem 
is  a  global  one,  meaning  that  the  robot 
has  to  deal  with  the  detection  of 
possible  paths  and  to  decide  which  of 
these  it  should  take.  This  includes  the 
problem  of  obstacle  avoidance.  When  the 
environment is "locally certain", as
for example a semi-artificial
environment with flat surfaces, the
technical  problem  of  how  to  move  the 
body  forward  is  comparatively  simple 
and  can  be  solved  by  driving  the  robot 
with  wheels.  If,  however,  the  local 
structure  of  the  environment  is 
uncertain,  i.e.,  consists  of  irregular 
terrain,  a  robot  with  walking  legs  is 
advantageous.  Several  attempts  have 
been  made  to  construct  such  a  walking 
machine.  Nevertheless,  comparing  the 
walk  of  a  six-legged  robot  with  that  of 
an  animal  such  as  an  insect, 
immediately  reveals  differences.  The 
walking  of  an  animal  is  much  more 
versatile,  and  it  appears  to  be  more 
efficient  and  elegant.  Thus  it  is 
useful  to  consider  biological  control 
mechanisms in order to apply these or
similar mechanisms to the control of
walking  legs  in  machines.  In  the  past, 
engineers  have  pointed  out  that  little 
information  is  available  on  the 
biological  control  mechanisms,  but  this 
situation  has  changed  recently.  This 
paper  provides  a  summary  of  the  recent 
biological  results  focussing  on  data 
obtained from insects, crustaceans, and
cats by means of behavioural methods.

2.  CONTROL  OF  THE  INDIVIDUAL  LEG 

The  results  to  be  described  here 
mainly  concern  the  coupling  mechanisms 
between  legs,  i.e.  those  mechanisms 
that  produce  a  proper  coordination  of 
the  walking  legs  even  when  walking  is 
disturbed.  However,  for  this  purpose  it 
is  necessary  also  to  consider  briefly 
how  the  movement  of  an  individual  leg 
is  controlled.  The  mechano-neuronal 
system  that  produces  this  movement 
might  be  called  the  step  pattern 
generator.  To  avoid  a  possible 
misunderstanding  it  should  be  stressed 
that  it  is  completely  open  whether  this 
step  pattern  generator  contains  an 
endogenous  central  pattern  generator. 

For  simplicity,  only  forward  walking 
will  be  discussed.  The  cyclic  movement 
Abstract

The purpose of this paper is threefold: to analyze an
adaptation  anomaly  observed  in  a  specific  genetic 
algorithm  designed  to  optimize  robot  trajectories,  to 
propose  an  explanation  for  this  unusual  adaptive 
behaviour  by  drawing  an  analogy  with  some  elementary 
mechanisms  in  nature,  and  to  suggest  that  a  much  more 
robust  adaptive  strategy  is  to  allow  concurrent 
adaptation  of  both  the  information  content  and  the 
representation  structure  by  the  genetic  plan.  This  latter 
strategy  results  in  the  optimization  of  the  ability  to 
adapt. 

This  paper  suggests  that  when  artificial  life  or 
machine  learning  applications  attempt  to  capture  the 
essence  of  natural  adaptation,  it  is  important  they  allow 
selection  to  operate  on  all  levels  of  the  system. 
Furthermore,  it  is  essential  to  expose  both  the  structure 
of  the  system  and  its  information  content  to  selection. 


1  A  Genetic  Algorithm  Designed  to 
Optimize Robot Trajectories

Most  robot  applications  are  based  on  a  motion  trajectory 
composed  of  a  movement  sequence  of  a  robot  arm. 
Mechanically,  a  robot  arm  is  an  open  kinematic  chain 
comprising relatively stiff links with a joint between
adjacent links. Each link represents one degree of freedom
and can be commanded to move independently of all other
links. Standard systems have six degrees of freedom to
obtain full spatial flexibility. Since a robot arm performs a
task through the motion of its end-effector attached to the
last link, the last link is the primary component of the
whole structure.

An arm-configuration is a unique arm structure defined
by a set of link positions (Fig. 1). Given the positions, the
end-effector’s position is uniquely determined. A robot
trajectory is defined by specifying a sequence of spatial
positions the end-effector is required to visit. Thus, any
given path is approximated with a finite number of intervals
specified by the motion vectors joining the discrete end-effector
positions. In other words, any sequence of end-effector
positions (>2) defines a valid trajectory which will
approximate the desired path with a measurable error.
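
Both ideas in this passage, that a set of link positions fixes the end-effector position and that a trajectory approximates a path with a measurable error, can be illustrated with a short sketch. The assumptions are ours and not taken from the paper: the arm is planar, each link position is a joint angle measured relative to the previous link, all links have unit length, and the error is taken as the largest perpendicular distance of the visited positions from the desired straight line.

```python
import math

def end_effector(joint_angles, link_lengths):
    """Forward kinematics of a planar arm; angles are relative to the previous link."""
    x = y = total = 0.0
    for angle, length in zip(joint_angles, link_lengths):
        total += angle
        x += length * math.cos(total)
        y += length * math.sin(total)
    return x, y

def max_deviation(waypoints, a, b):
    """Largest perpendicular distance of the waypoints from the line through a and b."""
    (ax, ay), (bx, by) = a, b
    length = math.hypot(bx - ax, by - ay)
    return max(abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length
               for px, py in waypoints)

# Hypothetical 3-link arm with unit links; each configuration yields one waypoint.
links = [1.0, 1.0, 1.0]
configs = [[0.0, 0.0, 0.0], [0.1, -0.1, 0.0], [0.0, 0.1, -0.1]]
waypoints = [end_effector(c, links) for c in configs]
error = max_deviation(waypoints, a=(0.0, 0.0), b=(3.0, 0.0))
```

A different choice or number of configurations gives a different `error`, which is the sense in which alternative trajectories for the same path can be compared.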


Fig.  1  -  3-link  planar  robot  arm  commanded  to  a  fully  stretched 
position,  but  the  end-effector  exhibits  a  steady-state 
positioning  deviation  from  the  horizontal  due  to  limited 
positioning  control  accuracy. 

Therefore, the optimization of a robot trajectory means
the identification of both the optimum combination and
number of end-effector positions, which implies that a great
many alternative trajectories should be considered. The
size of the trajectory space grows substantially further
when the robot used is of a redundant structure and most
end-effector positions instantiate a multitude of different
arm-configurations.

The  complexity  of  programming  a  trajectory  can  be 
appreciated  by  examining  the  vertical  plane  in  which  the 
end-effector  is  required  to  follow  the  straight  line  connecting 
points A and B (Fig. 2a). One robot trajectory may be
specified by sites 1 and 2 the end-effector should visit (Fig.
2b), while another trajectory might consider sites 3, 4, 5 and
6 as an alternative specification (Fig. 2c). The performance
resulting  from  the  different  trajectories  might  be  quite 
Abstract

A distributed sorting algorithm, inspired by how
ant colonies sort their brood, is presented for use
by robot teams. The robots move randomly, do not
communicate, have no hierarchical organisation,
have no global representation, can only perceive
objects just in front of them, but can distinguish
between objects of two or more types with a certain
degree of error. The probability that they pick
up or put down an object is modulated as a function
of how many of the same objects they have
met in the recent past. This generates a positive
feed-back that is sufficient to coordinate the robots'
activity, resulting in their sorting the objects into
common clusters. While less efficient than a hierarchically
controlled sorting, this decentralised organisation
offers the advantages of simplicity,
flexibility and robustness.

1.  Introduction 

What do a shopkeeper and an ant colony have in common?
Each of these organisms is able to sort similar but
different objects. When one examines an ant nest it is clear
that neither the workers, the brood nor the food are randomly
distributed. For example, the eggs are arranged in a
pile next to a pile of larvae and a further pile of cocoons, or
else the three categories are placed in entirely different parts
of the nest. The same is true in a shop. There is, however, an
essential difference. The shopkeeper decides where he is
going to put his different goods, and if he has assistants he
tells them where to place what. Ants work in parallel but do
not, as far as we can tell, have the capacity to communicate
like the shopkeeper, nor do they have a hierarchical organisation
whereby one individual makes the necessary decisions
and the others follow. Nevertheless, if you tip the
contents of a nest out onto a surface, very rapidly the workers
will gather the brood into a place of shelter and then sort
it into different piles as before.

This  article  describes  a  simple  behavioural  algo¬ 
rithm,  to  be  followed  by  each  worker,  that  generates  a  sort¬ 
ing  process.  Sorting  is  achieved  without  requiring  either 
external  heterogeneities  (e.g.  temperature  or  humidity),  hi¬ 
erarchical  decision-making,  communication  between  the  in¬ 
dividuals  or  any  global  representation  of  the  environment. 
We also stress that the ants/robots have only very local
information about the environment and a very short-term
memory,  and  furthermore  move  randomly,  no  oriented 
movement  being  necessary.  They  can't  see  far  off  nor  move 
directly  towards  objects  or  piles  of  objects. 
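
As a minimal sketch of the kind of rule described here, the Python fragment below lets the probabilities of picking up and putting down an object depend on the fraction f of recently visited positions that held an object of the same type. The functional forms, the constants k_pick and k_drop, and the memory length are illustrative assumptions, not values given in the text above.

```python
from collections import deque


def pick_up_probability(f, k_pick=0.1):
    """Assumed form: close to 1 when few similar objects were met recently."""
    return (k_pick / (k_pick + f)) ** 2


def put_down_probability(f, k_drop=0.3):
    """Assumed form: close to 0 when few similar objects were met recently."""
    return (f / (k_drop + f)) ** 2


class ShortTermMemory:
    """Remembers what was met on the last `length` steps (hypothetical helper)."""

    def __init__(self, length=10):
        self.cells = deque(maxlen=length)

    def record(self, seen):
        # `seen` is an object type such as "A", or None for an empty position.
        self.cells.append(seen)

    def fraction(self, kind):
        # Fraction of remembered steps on which an object of type `kind` was met.
        if not self.cells:
            return 0.0
        return sum(1 for c in self.cells if c == kind) / len(self.cells)
```

With rules of this shape, an isolated object is likely to be picked up and a carried object is likely to be dropped next to similar ones, which is the positive feed-back the text relies on.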

Our  aim  in  this  article  is  not  to  prove  that  the 
model  proposed  is  actually  how  the  ants  behave,  but  to 
show  that  such  an  algorithm  both  works  and  could  be  used 
by a team of robots. Inspired by our knowledge of the
importance of functional self-organisation or distributed
intelligence in ant colonies (Deneubourg, 1977; Deneubourg
et  al.,  1984, 1986,  1987;  Deneubourg  and  Goss,  1989;  Goss 
et  al.,  1990;  Aron  et  al.,  1990),  our  idea  presents  a  working 
illustration of how such a distributed system can have practical
applications in robotics, in accordance with ideas developed
by ourselves (e.g. Deneubourg et al., 1984, 1990;
Deneubourg and Goss, 1989), and others (e.g. Beni, 1988;
Brooks and Flynn, 1989; Sandini and Dario, 1989; Fukuda
and Kawauchi, 1989; Brooks et al., 1990; Steels, 1990). The
Abstract

During  the  summer  of  1989  we  established 
the  Intelligent  Sensing  and  Control  Laboratory 
in  the  Department  of  Artificial  Intelligence  at 
Edinburgh.  This  laboratory  is  designed  to 
support  both  post-graduate  teaching  and  basic 
research  into  intelligent  sensing  and  control. 
In  this  paper  we  present  the  motives  for 
setting  up  the  Intelligent  Sensing  and  Control 
Laboratory  and  the  design  and  implementation 
of a Lego Technic™ based technology used
to  build  simple  autonomous  vehicles  intended 
to  support  the  teaching  and  research  activities 
of  the  laboratory.  We  report  on  some  of 
the  experiences  gained  from  its  first  year  of 
operation  and  relate  these  to  the  requirements 
of  our  biologically  oriented  research  programme 
into  intelligent  behaviour  and  its  development  in 
autonomous  artificial  mobile  systems. 


1  Introduction 

During the summer of 1989 we established the
Intelligent  Sensing  and  Control  laboratory  (ISC  lab)  in 
the  Department  of  Artificial  Intelligence  at  Edinburgh. 
This  laboratory  is  designed  to  support  both  post¬ 
graduate  teaching  and  basic  research  into  intelligent 

† Names appear in alphabetical order, with both being principal
authors  on  this  occasion. 


sensing  and  control.  In  the  autumn  term  (October  to 
December,  1989)  it  was  used  for  the  practical  sessions 
of a new ten-week module called “Intelligent Sensing
and Control”¹, during which students worked in pairs to
build a series of Braitenberg-like (Braitenberg 1986)
mobile  vehicles  of  increasing  motor-sensory  sophistica¬ 
tion.  It  is  now  being  used  for  a  number  of  other  projects 
in  a  long  term  research  programme  to  investigate 
the  acquisition,  maintenance,  and  adaptation  of  task 
achieving  competences  in  autonomous  mobile  robots. 

In  this  paper  we  present  the  motivation  for  setting 
up  the  ISC  lab  and  the  design  and  implementation  of 
the  technology  used  to  build  the  autonomous  vehicles. 
We  also  report  on  some  of  the  experiences  gained  from 
its  first  year  of  operation  in  both  teaching  and  research. 


2  Background  and  Motivation 

There  are  two  different  motives  for  setting  up  the 
Intelligent Sensing and Control Laboratory within the
Artificial  Intelligence  Department  at  Edinburgh.  The 
first  concerns  the  nature  of  intelligent  behaviour  and 
the  way  we  want  to  investigate  it.  The  second  concerns 
the  kind  of  sensing  and  control  we  want  to  teach  to  our 
intelligent  robotics  students. 


¹ This forms part of the Intelligent Robotics theme of the
Department's  MSc  in  Knowledge-based  Systems. 
Abstract

In  this  paper,  we  propose  the  use  of  the  animat  approach  to 
automatically  generate  animation  scripts  for  computer  synthe¬ 
sized  movies.  We  want  to  design  a  system  where  animats  are 
actors able to improvise from a few pieces of contextual information.
We  plan  to  implement  such  animats  with  extended  classifiers 
allowing a compact encoding of behavior rules while preserving
their ability to be modified by inductive genetic operations.


1  Introduction 

As computer animated characters become more and more
realistic in their rendering, the problem of specifying more
realistic individual and collective behaviors also appears. We
think that animats can be used to solve this new problem and
that behavior simulation based systems can change the way
computer animations are designed. We propose a “programming
by environment” approach using animats and evolution
simulation  which  can  drastically  reduce  the  work  of  animation 
script  writing. 

Animats are computer simulated entities, exhibiting
animal-like  autonomous  individual  or  collective  behavior.  We 
want  to  use  them  as  low-cost  credible  crowd  artists  in  order 
to  ease  the  writing  of  animation  scripts.  In  the  world  of  film 
making, crowd artists are employed to give the audience background
information about the time and location where the
action of the movie takes place. The movie director gives
these actors fuzzy indications on how to behave, then they are
left nearly without control during the filming. On the other
hand, as the central characters' behaviors may not easily be inferred
from  the  movie  context,  the  corresponding  actors  are  more 
precisely  directed  and  have  less  freedom  than  crowd  artists. 


2  Animation  automation 

Consider  traditional  animation  movies:  drawn  by  hand,  they 
require  a  frame-by-frame  precision  level  script  for  each  char¬ 
acter  of  the  animation.  A  first  level  of  automation  consists  in 
drawing a few frame images by hand and applying an interpolation
procedure to generate the missing frames required for a
smooth  animation.  Improvements  from  the  traditional  paper 
based  technique  only  reside  in  greater  flexibility  of  the  draw¬ 
ing  tools  and  reduction  of  involved  manpower. 

By  using  computers,  we  can  automate  the  generation  of  the 
script  by  writing  procedures  in  a  computer  language.  Object 
oriented  [6]  languages  seem  to  match  the  requirements  of  an¬ 
imation  programming.  In  such  languages,  an  object  is  speci¬ 
fied  by  a  local  state  (a  set  of  state  variables)  and  a  set  of  proce¬ 
dures  (the  object’s  methods).  An  object's  behavior  is  imple¬ 
mented  by  its  methods.  As  its  methods  process  the  object’s 
local state, it is easy to obtain a wide range of behaviors from
a few procedural specifications, by varying the content of the
entities' local states.
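
The point about local state can be illustrated with a small sketch. The class below is a hypothetical example (the names Actor, speed and heading are assumptions, not taken from any animation system): one method definition is shared by all actors, and different behaviors arise only from different state variables.

```python
class Actor:
    """An animated entity: a local state plus methods that process it."""

    def __init__(self, position, speed, heading):
        # Local state: a set of state variables.
        self.position = position
        self.speed = speed
        self.heading = heading  # +1 moves right, -1 moves left

    def step(self):
        # The behavior is implemented by the method, driven by the local state.
        self.position += self.speed * self.heading
        return self.position


def run(actor, n_frames):
    """Return the actor's position at each of n_frames successive frames."""
    return [actor.step() for _ in range(n_frames)]


# Same procedure, different local states, different behaviors.
fast_right = run(Actor(position=0.0, speed=1.0, heading=+1), 3)
slow_left = run(Actor(position=0.0, speed=0.5, heading=-1), 3)
```

Here fast_right and slow_left are two distinct trajectories produced by a single procedural specification.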

Although object oriented programming systems facilitate the
behavioral specification of large groups of simulated actors, they
do  not  give  any  assurance  that  the  resulting  acting  will  meet 
the  desired  goals.  This  work  is  still  under  the  responsibility  of 
the  designer  who  must  keep  in  mind  a  model  of  the  potential 
interactions  between  actors. 

A  few  fixed  procedures  can  simulate  complex  behaviors. 
Even  a  cellular  automaton,  Conway’s  game  of  life  [2],  can 
generate  primitive  animats  -  gliders  and  glider  guns,  for  in¬ 
stance  -  with  interesting  behaviors.  In  the  frame  of  computer 
animated graphics, we are moreover helped by spectators who
will tend to interpret the events occurring in the movie. This
fact is highlighted in V. Braitenberg’s book [3] where simple
animats (called vehicles) are involved. These animats are
carts, endowed with sensors (photosensitive cells) and effectors
(propellers such as wheels), which are wandering in an
environment containing light sources. By changing the disposition
and the properties of the different vehicle brain compo-
Abstract

A  model  of  the  mechanisms  underlying 
exploratory  behaviour,  based  on  empirical 
research  and  refined  using  a  computer 
simulation, is presented. The behaviour of
killifish from two lakes, one with killifish
predators  and  one  without,  was  compared  in 
the  laboratory.  Plotting  average  activity  in  a 
novel  environment  versus  time  resulted  in  an 
inverted-U-shaped  curve  for  both  groups; 
however,  the  curve  for  killifish  from  the  lake 
without  predators  was  (1)  steeper,  (2)  reached 
a  peak  value  earlier,  (3)  reached  a  higher 
peak  value,  and  (4)  subsumed  less  area  than 
the  curve  for  killifish  from  the  lake  with 
predators.  We  hypothesize  that  the  shape  of 
the  exploration  curve  reflects  a  competition 
between  motivational  subsystems  that  excite 
and  inhibit  exploratory  behaviour  in  a  way 
that is tuned to match the affordance
probabilities  of  the  animal's  environment.  A 
computer  implementation  of  this  model 
produced  curves  which  differed  along  the 
same  four  dimensions  as  differentiate  the  two 
killifish  curves.  All  four  differences  were 
reproduced  in  the  model  by  tuning  a  single 
parameter: the time-dependent component of
the  decay-rate  of  the  exploration-inhibiting 
subsystem. 

1.  Introduction 

Selection  tends  to  favour  the  evolution  of 
systems  whose  organization  enables  more  efficient 
ways  of  perceiving  and  interacting  with  the 
environment,  and  greater  capacity  to  cope  flexibly 
with  environmental  change.  This  often  entails 
progressive differentiation of the parts of a system
into  subsystems  that  are  specialized  to  take  care  of 


different  aspects  of  survival.  For  example,  the  parts 
of an animal involved in the detection and
avoidance  of  predators  can  be  thought  of  as 
comprising  one  subsystem,  the  parts  involved  in 
obtaining  and  metabolizing  energy  another,  and  so 
on. 

Thus, an adaptive system can be thought of as a
set  of  specialized  subsystems  working  together  to 
maintain  the  integrity  of  the  whole.  However,  when 
we  divide  the  system  into  subsystems  according  to 
ultimate causal goals such as avoiding predators and
obtaining  energy,  every  subsystem  that  relies  on 
behaviour  includes  skeletomusculature.  Much  of 
the  skeletomusculature  plays  a  role  in  the 
functioning of many subsystems. This makes sense;
subsystem-specific limbs are redundant, to the
extent that (1) subsystems require only periodic
control of skeletomusculature to function
effectively, and (2) the skeletomusculature required
to  meet  the  needs  of  one  subsystem  could  also  be 
used  to  meet  the  needs  of  another.  The  upshot: 
skeletomusculature  is  shared  amongst  subsystems, 
and  though  subsystems  have  to  work  together 
cooperatively, they must also compete for control over
what McFarland and Sibly (1975) refer to as the
system’s “behavioural final common path” (see for
example, Miller, 1971; Baerends and Drent, 1982;
Colgan, 1989). Which subsystem wins the
competition depends upon the relative need
(deviation from homeostasis) of the subsystems, the
opportunities and dangers currently afforded by its
environment,  and  the  pros  and  cons  of  engaging  in 
behaviour  that  has  only  indirect  or  long-term 
effects.  Dawkins  (1976)  has  suggested  that 
subsystem  competition  is  lessened  somewhat  by  the 
fact  that  the  behaviours  associated  with  various 
subsystems occupy different positions on an
established  behavioural  hierarchy;  given  equal  need 
to  control  behaviour,  the  behaviour  that  is  higher 
Abstract


This  paper  presents  a  design  for  a  sensory-motor  interface  to  generate  motivated  behaviour 
and  some  capacity  to  learn,  which  may  be  useful  in  producing  task-oriented  autonomous 
robots.  There  is  a  single  design  for  the  basic  sensory-motor  interface.  By  varying  sensor  bias, 
parameter  settings  and  type  of  sensory  input  one  can  generate  many  motivational  phenomena 
observed  in  lower  vertebrates.  Chains  of  instinctive  behaviours  can  be  refined  by  the  experi¬ 
ence  of  whether  a  response  to  a  particular  stimulus  is  successful.  The  property  of  incentive 
motivation  can  be  mimicked,  as  well  as  learning  by  punishment  and  drive-reduction  learning. 
Throughout  the  discussion  we  attempt  to  point  out  some  of  the  difficulties  which  might  be  en¬ 
countered  in  designing  bio-mimic  autonomous  robots. 


There  are  three  broad  classes  of  behaviour  which 
are  generally  referred  to  as  motivated.  In  the  first 
class,  motivation  is  based  on  the  reduction  of  phy¬ 
siological  deficits.  Hungers  for  calories,  proteins 
and minerals are the obvious examples of these
'drive-reduction' motivations. No deficit-correction
is involved in the second category,
where the motivations are 'incentive'; however
these  incentive  motivations  involve  stimuli  which 
are  inherently  hedonic  -  either  attractive  or  aver¬ 
sive.  An  example  is  the  motivation  to  consume 
tasty  non-caloric  sugar  substitutes  like  saccharin, 
which  can  reward  behaviour  quite  effectively  in 
hungry animals. Finally, there is motivation to perform
sequences of displays or 'fixed action patterns'
with neither physiological deficit-correction
nor obvious hedonic responses involved - for
example  the  motivation  of  territorial  animals  to 
fight  intruders. 

This  paper  presents  a  model  which  was  dev¬ 
ised  to  simulate  fighting  behaviour  in  Siamese 
fighting  fish  (the  last,  least  obvious  class  of  motiva¬ 
tion),  and  which  turned  out  to  model  the  other 
types  of  motivation  as  well.  This  model  is  so  sim¬ 
ple  and  explicit  that  it  could  be  used  as  a  design 
principle  for  building  sensory-motor  interfaces  to 
control  a  range  of  motivated  behaviours  in  auto¬ 
nomous  robots. 

The model has two basic elements: (1) a circuit
diagram showing 'neurons' connected by


'synapses',  some  of  which  have  variable  strength 
(fig.  1),  and  (2)  a  connection  strength  change  rule 
related  to  Hebb’s  rule,  which  controls  the  strengths 
of  the  synapses  as  a  function  of  activity  in  the  con¬ 
nected  neurons  (fig.  2). 

We  will  describe  the  model  in  terms  of  an 
example  rather  than  in  abstracto,  although  a  brief 
mathematical  description  is  included  for  complete¬ 
ness  at  the  end  of  the  paper  (fig.  4).  A  detailed 
mathematical model has been constructed (Halperin
and Halperin, in prep.) and it forms the basis of
a successful computer simulation (Halperin, Halperin,
Rutherford and Dunham, in prep.). Since the
fighting  fish  social  behaviour  which  led  to  the 
development  of  the  model  would  be  difficult  to 
visualize  for  anyone  but  ethologists,  we  will  instead 
present  the  entirely  hypothetical  example  of  a 
scrap-collecting robot. The goal is to illustrate the
principles and also to illustrate the problems which
would have to be dealt with when using these ideas
in a practical design context. Many of these problems
are of the type which have been dealt with so
successfully in the biological mimic robots built at
MIT (e.g. Brooks and Flynn, 1989), and the success
of these robots creates optimism that extensions
to motivated bio-mimic robots can succeed.

The  hypothetical  example  we  will  consider  is 
a  litter-collecting  system  for  a  park.  There  is  a 
large,  heavy  collecting  machine  (or  an  expensive 
human  to  do  the  collecting)  and  a  small  army  of 
Abstract

This paper draws an analogy between the way
insects use vision to move themselves with respect to
local landmarks, and the problem of moving objects
relative to each other in vision-guided robotic assembly.
In particular, an algorithm is presented for attracting
similar shapes together which was directly
inspired  by  a  model  of  navigation  in  honeybees,  and 
which  shares  the  same  characteristics  of  robustness 
and  immunity  to  noise.  In  particular,  the  algorithm 
can  rotate,  translate  and  scale  one  2D  shape  to  align 
it  with  another  despite  the  presence  of  significant  dis¬ 
tortion  and  missing  or  extraneous  features. 

1  Introduction 

There are several reports that insects are able to store
retinal images and to later compare these stored images
with  a  current  one  in  order  to  facilitate  homing  with 
respect  to  a  set  of  (proximal)  landmarks.  For  exam¬ 
ple  digger  wasps  can  accurately  fly  to  their  tiny  nest 
entrance  by  using  local  landmarks  such  as  pine  cones 
scattered  up  to  a  metre  or  more  from  the  entrance  itself 
(Tinbergen,  1932).  Similarly,  a  honeybee  can  reliably 
relocate  a  food  source  such  as  a  flower  even  though  it 
is  too  small  to  see  from  a  distance  or  can  only  be  seen 
from above due to surrounding vegetation. Again, once
they are in the approximate vicinity of the food it has
been shown that bees can home in on the exact location
by  using  local  landmarks  (Anderson,  1977). 

One  of  the  most  striking  features  of  this  type 
of  homing  mechanism  is  its  robustness  in  the  face 
of  missing,  additional  or  moved  landmarks  between 
visits. Cartwright and Collett proposed an elegant
computational  model  of  landmark  learning  in  honey¬ 
bees  in  which  the  bee  stores  an  image  (retinal  snap¬ 
shot)  of  its  surroundings  when  it  finds  a  food  source 
(Cartwright & Collett, 1983). To re-find this location
the bee must first find the approximate vicinity (i.e.
by some other form of navigation) so that most of the
relevant landmarks are visible. Given these starting conditions,
the authors show how comparison of the snapshot
and a current image can lead to the computation
of  a  flight  vector  which,  if  followed,  tends  to  bring  the 
bee  into  a  position  where  snapshot  and  new  image  are 
more  similar.  Thus  as  the  (simulated)  bee  flies  around 
it  continually  compares  its  snapshot  with  the  image  of 
its current surroundings and computes a new flight vector
and so on, thus bringing it ever closer to the ‘memorised’
location. 

The  main  reason  for  the  robustness  of  Cartwright  and 
Collett’s algorithm is that it does not need to solve the
correspondence  problem  of  which  features  in  the  image 
match  which  features  in  the  snapshot.  Computation  of 
the  flight  vector  is  independent  of  the  goodness  of  fit  be¬ 
tween  image  and  snapshot  and  so  no  search  is  involved. 
The  effect  of  the  computation  is  that  the  bee  moves  to¬ 
wards  things  (or  gaps  between  things)  which  appear  too 
small  and  away  from  things  which  appear  too  big. 

For  simplicity,  it  is  assumed  that  image  and  snapshot 
each consist of a 360° wide ring of dark and light segments.
Each light or dark region in the image is paired
with  the  nearest  region  of  the  same  sign  in  the  snapshot 
and  the  angular  difference  is  found.  These  differences 
are  simply  averaged  to  find  an  overall  rotation  for  the 
bee. Translational vectors are computed by comparing
the  angular  sizes  of  dark  and  light  areas.  For  example 
a  dark  area  in  the  image  is  compared  with  the  nearest 
dark  area  in  the  snapshot;  if  the  image  area  is  smaller 
then this suggests moving radially towards that part of
the  image.  As  with  rotation,  radial  vectors  generated 
in  this  way  are  simply  averaged,  giving  an  overall  trans¬ 
lation  vector.  Even  if  individual  pairings  of  edges  are 
wrong  in  terms  of  the  correspondence,  the  overall  flight 
vector  tends  to  be  roughly  right  and  small  errors  are 
compensated  for  by  the  iterative  nature  of  the  process. 
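
The pairing-and-averaging scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each region is represented as a hypothetical (bearing, angular_size) pair in radians, regions of a single sign (for example, dark) are handled per call, every radial contribution has unit length, and the function name flight_update is ours rather than the original authors'.

```python
import math


def wrap(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def nearest(bearing, regions):
    """Return the region whose bearing is angularly closest to `bearing`."""
    return min(regions, key=lambda r: abs(wrap(r[0] - bearing)))


def flight_update(image, snapshot):
    """Return (mean rotation, (dx, dy)) from regions of one sign.

    image, snapshot: lists of (bearing, angular_size) tuples.
    """
    if not image or not snapshot:
        return 0.0, (0.0, 0.0)
    rotation = 0.0
    dx = dy = 0.0
    for bearing, size in image:
        snap_bearing, snap_size = nearest(bearing, snapshot)
        # Rotation: signed angular difference to the paired snapshot region.
        rotation += wrap(snap_bearing - bearing)
        # Translation: move towards regions that look too small,
        # away from regions that look too big.
        if size < snap_size:
            sign = 1.0
        elif size > snap_size:
            sign = -1.0
        else:
            sign = 0.0
        dx += sign * math.cos(bearing)
        dy += sign * math.sin(bearing)
    n = len(image)
    # Contributions are simply averaged; no correspondence search is done.
    return rotation / n, (dx / n, dy / n)
```

Calling the function once for dark regions and once for light regions, then averaging the two results, would give one step of the iterative process; repeating the step is what compensates for individual mispairings.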

1.1  A  robot  guidance  problem 

It  was  decided  to  take  a  similar  approach  to  the 
bee  model  to  help  solve  an  analogous  problem  which 
arises  when  guiding  a  hand-held  object  to  bring  it 
in  line  with  a  stationary  object  under  camera  guid- 
Abstract

Darwinian  evolution  produces  intelligent  be¬ 
haviour  without  a  designer,  and  this  can  be  used 
to  evolve  behaviour  in  simulated  organisms.  The 
problems  associated  with  using  genetic  algorithms 
to  evolve  programs  in  a  conventional  language, 
and  to  evolve  the  architecture  of  connectionist 
networks,  are  compared.  For  an  open-ended  space 
of  network  architectures,  a  developmental  lan¬ 
guage  to  model  the  production  of  a  network  from 
the  genotype  is  necessary,  as  is  a  new  theoreti¬ 
cal  analysis  of  genetic  algorithms  not  limited  to 
genotypes  of  a  fixed  length. 

1  Introduction 

Artificial  Intelligence  arose  as  a  field  of  study  from  the 
belief  that  intelligent  human  behaviour  could  be  for¬ 
malised,  and  hence  could  be  mechanised.  Problem  solv¬ 
ing  must  be  done  according  to  rules,  so  this  approach 
went;  put  the  rules  in  a  machine,  and  we  will  have  an 
intelligent  machine.  The  greatest  successes  have  been, 
understandably,  in  just  those  fields  of  human  intelligence 
where the problems can be formally defined, e.g. chess-playing,
expert systems in simple domains; even here,
success  has  been  limited.  Connectionism  (McClelland 
and  Rumelhart  1987)  uses  a  different  approach;  investi¬ 
gating  network  models  which  are  based  on  a  very  simpli¬ 
fied  model  of  the  brain,  namely  large  numbers  of  simple 
processing  nodes  with  many  wires  connecting  them,  pass¬ 
ing  activations  throughout  the  network.  A  major  insight 
to  come  from  this  approach  is  that  the  behaviour  of  the 
whole can look as though it is obeying explicitly programmed
rules, even though one can see that this is just
an  emergent  property  of  the  underlying  mechanisms. 

The  conventional  A.I.  approach  tries  to  design  into  a 
program  intelligent  behaviour;  the  connectionist  tries  to 
design  networks  that  will  produce  intelligent  or  adaptive 
behaviour.  Yet  Darwinian  evolution  shows  us  that  there 
can  be  intelligent  behaviour  without  a  designer. 

Hence  as  one  can  consider  the  intelligence  and  adapt¬ 
ability  of  humans  and  other  animals  to  be  an  emergent 
property  of  their  evolutionary  history  in  their  environ¬ 
ment,  one  can  also  consider  the  possibility  of  the  emer¬ 
gence  of  intelligent  and  adaptive  behaviour  in  simulated 


organisms,  in  simulated  environments,  with  some  form 
of  evolutionary  algorithm. 

Approaches  on  these  lines  have  been  the  subject  of 
much  recent  interest,  and  conferences  on  similar  themes 
have been held under the title of ‘Artificial Life’ in Santa
Fe in 1987 and 1990 (Langton 1989, Langton et al. 1991),
and  with  the  titles  ‘Evolution,  Games  and  Learning’  and 
‘Emergent  Computation’  by  the  Center  for  Nonlinear 
Studies,  Los  Alamos,  in  1985  and  1989  (Farmer  et  al. 
1986,  Forrest  1990). 

This paper begins with an introduction to the most developed
work on evolutionary algorithms, Genetic Algorithms
(GAs). There follows a discussion of the problems
involved  in  using  an  evolutionary  approach  to  develop¬ 
ing  conventionally  structured  programs,  such  as  might 
be  required  by  a  conventional,  symbol-processing,  A.I. 
approach  to  simulating  behaviour;  a  program  evolution 
system  is  described  that  has  a  novel  evaluation  function 
which  sidesteps  one  of  these  problems,  at  the  expense  of 
others.  Classifier  systems  are  mentioned,  leading  to  a 
discussion  of  recent  work  on  applying  evolutionary  tech¬ 
niques  to  developing  connectionist  networks  with  archi¬ 
tectures  appropriate  for  the  cognitive  task  they  are  fac¬ 
ing.  Finally,  the  necessity  is  stressed  for  a  new  theoretical 
backing  to  GAs,  and  the  underlying  pitfalls  of  evolving 
computational  systems  are  looked  at. 


GAs  were  developed  initially  by  John  Holland  in  the 
1960’s  (Holland  1975)  as  a  form  of  search  technique  mod¬ 
eled  on  Darwinian  evolution.  The  most  accessible  intro¬ 
duction  is  by  Goldberg  (1989);  other  sources  are  Davis 
(1987), and the Proceedings of the first three GA conferences
(Grefenstette 1985, Grefenstette 1987, Schaffer
1989). 

Given  a  search  problem,  with  a  multi-dimensional 
space  of  possible  solutions,  a  ‘genetic-code’  representa¬ 
tion  is  chosen  such  that  each  point  in  the  search  space 
is  represented  by  a  string  of  symbols,  the  genotype.  A 
number  of  initial  random  genotypes  are  produced,  typi¬ 
cally  by  a  random  number  generator,  which  form  the  ini¬ 
tial  population.  Each  of  the  corresponding  points  in  the 
search  space  (which  can  be  considered  as  representing 
Abstract

Evolution is most often viewed (and formalised) as an optimisation
process. In this paper we suggest that a useful alternative heuristic
may be to view (and formalise) evolution as a self-enhancing
pattern transformation, pattern detection and pattern generation
process. This suggestion is based on the growing awareness of the
emergence of complex behaviour from simple environmentally
dependent action rules when they operate in a structured
environment, and the structuring of the environment by such
behavioural patterns. We describe paradigm worlds, which suggest
that such emergent behaviour may underlie behaviour patterns
observed in various species. We show how the emergence of
macro-behaviour patterns can be interpreted as a form of pattern
processing by the action rules on the environment. We suggest that
these emergent patterns function as a prepattern for evolutionary
processes: evolution fixates and enhances these patterns.

1. Introduction

Ever since Darwin's profound insight in equating 'survival'
and 'fitness' for self-replicating entities, evolutionary theory
has had a strong footing in terms of optimisation processes.
This optimisation viewpoint pervades most biological
thinking about evolution and adaptation. Population
genetics is entirely formulated in such terms, and traits of
organisms are customarily 'explained' in terms of their
'fitness', recently in particular in sociobiology and
behavioural ecology. Genetic Algorithms (Holland 1976,
Goldberg 1989) have used basic 'genetic mechanisms' for
solving general optimisation problems. Only relatively
recently have quantitative studies begun to expose the
constraints on a 'mutation selection' process leading to
appreciable optimisation (after all, 'optimisation' by
'survival of the fittest' is not a tautology). Eigen and
Schuster (1979) exposed the 'error threshold', i.e. they
showed that only a limited amount of mutation is
compatible with evolutionary optimisation. Kauffman
(Kauffman & Smith 1986, Kauffman 1989a,b) stressed that
optimisation is only possible in not too rugged fitness
landscapes, i.e. only if similar genotypes have, in general,
similar fitnesses. Rugged fitness landscapes result from
extensive coupling between genes, by which the system
exhibits strong selforganising properties. Such
selforganisation thus seems to be a constraint on
evolutionary optimisation.

All these approaches use an external, a priori, user-imposed
fitness criterion. Only a few models are studied in which
only  survival  determines  the  evolutionary  process  in  a 


coevolutionary  context  (Conrad  &  Rizki  1989,  Packard 
1988,  Holland  1990,  Kauffman  1989b).  In  such  models 
'fitness'  is  not  clearly  defined,  and  fitness  landscapes  are 
wildly  dynamic  entities,  if  they  can  be  visualised  at  all. 

In  this  paper  we  propose  that  it  may  be  worthwhile  to  view 
evolutionary  processes  not  as  primarily  an  optimisation 
process,  but  instead  as  pattern  processing  (i.e.  as  pattern 
detection,  pattern  transformation  or  pattern  generation). 
Such  a  view  is  of  course  entirely  compatible  with  the 
optimisation  viewpoint,  but  provides  different  heuristics 
for  studying  such  processes.  In  particular  we  should  like  to 
study  what  may  be  called  the  generation  of  'fitness' 
dimensions,  rather  than  walks  through  fixed  fitness 
landscapes, although we prefer a terminology like 'pattern of
survival' rather than fitness. The insight that pattern
recognition and pattern detection can be studied in terms of
energy minimisation ('optimisation' as used in evolutionary
theory) (Hopfield 1984, Ackley et al. 1985 and the
extensive literature which followed this conceptualisation)
has led to important new models and machines for pattern
recognition/pattern detection. We hope that a similar, but
reversed, change of viewpoint with respect to evolution (i.e.
from optimisation to pattern detection) will likewise lead to
new models elucidating (and possibly machines
exhibiting) innovative evolutionary adaptation.

2.  TODO  and  emergent  behaviour 

2.1  Introduction  to  the  TODO  principle. 

The  potential  of  local  rules  to  generate  complex  behaviour 
in  interaction  with  a  structured  environment  was  first  hinted 
at by Simon (1969) in his phrase: 'an Ant viewed as a
behaving system is quite simple, the apparent complexity of
its behaviour in time is largely a reflexion of the
complexity of the environment in which it finds itself...';
'a Man viewed as a behaving system is quite simple, the
apparent complexity of his behaviour in time is largely a
reflexion of the complexity of the environment in which
he finds himself...'. Simon, and with him most Artificial
Intelligence research, have concentrated entirely on humans,
and have in practice dismissed this phrase as an irony. Nor
have those studying animal behaviour taken the hint
seriously: they have continued to study behaviour virtually
independent  of  the  environment  or  they  have  paid  attention 
to  the  environment  as  a  constraint  on  (optimising) 
behaviour  only.  By  contrast  our  own  research  has  been  in 
the  direction  of  Simon's  pointer,  but  has  gone  beyond  it  by 
Abstract

This  paper  reviews  some  issues 
concerning  the  use  of  optimality 
principles  in  the  context  of  animal 
behaviour.  After  a  brief  discussion  of 
constraints,  I  look  at  matching  and 
maximizing  as  accounts  of  behaviour.  As 
long  as  maximization  is  used 
descriptively  there  is  no  conflict 
between  these  approaches.  They  are 
alternative  forms  of  description;  in 
general it is possible to convert one
form of description into the other.
Melioration has been suggested as a
principle that underlies matching. When
faced with two alternatives melioration
requires that the animal increases its
allocation  of  time  to  the  alternative  with 
the  higher  local  rate  of  reinforcement. 
When  the  local  rates  become  equal,  an 
equilibrium  is  attained  at  which 
matching  occurs.  I  present  a  simple 
model  of  melioration  in  which  this 
equilibrium  is  not  necessarily  stable.  It 
is  shown  that  time  allocations  can  be 
periodic  or  chaotic,  and  that  matching 
does  not  necessarily  occur. 

1.  Introduction 

The  use  of  optimality  principles  in  biology 
remains  controversial  (see  for  example  Maynard 
Smith 1978, Gould and Lewontin 1979, Williams
1985, Ollason 1987, Dupré 1987). In this
paper I make no attempt to review all the issues.
After some remarks on the problem of
constraints, I concentrate on alternatives to
optimality that have emerged in operant
psychology. It is argued that some of the
confusion  in  this  area  arises  because  of 
ambiguities  concerning  the  status  of  optimality  as 
an  account  of  behaviour.  I  then  consider  a 
dynamic  principle  called  melioration.  This  is  a 
local  optimization  principle  that  may  not  always 
result in optimal behaviour. I show that under
some  circumstances  a  form  of  melioration  can 
result  in  chaotic  behaviour. 
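
The idea that a melioration equilibrium need not be stable can be illustrated with a toy iteration. The sketch below is not the model presented in this paper: it assumes, purely for illustration, that the local rate of reinforcement on each alternative falls linearly with the proportion of time spent on it, and that the animal shifts its allocation by a fixed gain times the difference in local rates. The names `gain`, `a1`, `a2` and the clipping bounds are hypothetical.

```python
def meliorate(p0=0.5, gain=0.1, a1=2.0, a2=1.0, steps=200, bounds=(0.01, 0.99)):
    """Iterate a toy melioration rule and return the allocation trajectory.

    p is the proportion of time allocated to alternative 1.
    Assumed local rates: a1 * (1 - p) on alternative 1, a2 * p on alternative 2.
    """
    low, high = bounds
    p = p0
    trajectory = [p]
    for _ in range(steps):
        local_rate_1 = a1 * (1.0 - p)
        local_rate_2 = a2 * p
        # Shift time towards the alternative with the higher local rate,
        # keeping the allocation inside the allowed bounds.
        p = min(high, max(low, p + gain * (local_rate_1 - local_rate_2)))
        trajectory.append(p)
    return trajectory


stable = meliorate(gain=0.1)    # settles where the two local rates are equal
unstable = meliorate(gain=0.9)  # overshoots and ends in a two-step cycle
```

With the small gain the allocation converges to a1 / (a1 + a2); with the large gain each adjustment overshoots the equilibrium and the allocation alternates between two values. This linear toy shows only stable and periodic outcomes, not chaos.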

2.  Constraints 

Gould  and  Lewontin  (1979)  use  the  central  dome 
of  St  Mark's,  Venice  as  a  starting  point  for  their 
attack  on  optimality.  The  dome  rests  on  four 
pendentives,  which  Gould  and  Lewontin  refer  to 
as spandrels. (The term 'spandrel' is usually
restricted to 2-dimensional surfaces of the sort
shown in Figure 1.) Each pendentive is decorated
with a mosaic of one of the four evangelists,
together with his associated river. Gould and
Lewontin suggest that the design could result in
the view that the architecture is secondary to it.
They contrast this with the 'proper path of
analysis' that 'begins with an architectural
constraint: the necessary four spandrels and their
tapering triangular form' (p. 582).

It is important to realise that this is not an
argument against an optimality analysis, but
merely an illustration of the importance of asking
the right question. As Gould and Lewontin
(1979) state, the mosaics clearly serve the
function of expressing the Christian faith. We
can look at the decoration as a whole in this light,
and  we  can  compare  the  decoration  of  restricted 
triangular  spaces  in  various  domed  churches.  The 
architecture  specifies  the  constraints  within 
which an 'optimality' analysis of the decoration
operates. 
Abstract

Examples of simulation models applied to the long
distance  orientation  of  rodents,  bees,  salmon  and 
pigeons  are  reviewed.  In  each  case  the  models 
provide  new  insights  which  are  contradictory  to  the 
current  map  hypothesis.  Salmon  and  rodent 
performances  are  assumed  to  be  consistent  with  a 
random  process.  Bees  supposedly  use  non  cognitive 
processes  to  perform  map-like  orientation 
procedures, and pigeons' homing is found to
involve mainly stochastic processes. The efficiency
of the simulated models as well as the fact that they
point  to  hitherto  underestimated  aspects  of 
orientation  make  them  useful  new  tools  for  solving 
spatial  problems. 

1.  Introduction 

Authors  using  the  theoretical  approach  to  the  long 
distance  orientation  in  animals  have  consistently  neglected 
the  development  of  quantitative  models.  Consequently, 
the properties of random processes have rarely been
formulated and their possible contribution to observed
performances has been underestimated (Jamon, 1987).
Nor have performances been attributed to any
distinguishable orientation mechanisms. Indeed, long
distance orientation has been largely interpreted in terms
of analogies with human navigation techniques, leading to
the explicitly formulated idea that 'routine animals'
movements are governed by a navigational process closely
analogous to every day marine practice' (Gallistel, 1989).
The reference to man made concepts of orientation is
based on the 'map' metaphor, which implies that animals
can build up a mental representation of their
environment. Two sorts of maps have been envisaged.
When an animal moves in a familiar area, it is supposed
to develop a mental representation of the landmarks
(Wiltschko and Wiltschko, 1987, talk about animals'
mental picture of the spatial distribution of the factors
used), which has been variously called the mosaic map,
familiar area map, topographical map or cognitive map.
When the animal has to orient in an unfamiliar territory,
it is supposed to use some sort of bicoordinate map,
formed by the intersection of at least two gradients, in
association with a compass, exactly as human navigators
do.  In  this  case,  one  talks  about  a  grid  map,  gradient  map 
or  navigational  map.  This  map  and  compass  hypothesis 
was  first  formulated  by  Kramer  (1953),  but  considerable 
difficulties  have  since  been  encountered  in  determining 
the  map. 

Recent  developments  in  the  theoretical  approach  to 
orientation  have  led  to  more  attention  being  paid  to 
quantitative  models,  owing  to  the  development  of 
computers  and  simulation  techniques.  These  have  thrown 
new  light  on  orientation  by  predicting  the  performances  of 
orientation mechanisms which were considered a priori to
be  implausible. 

To show how these new techniques can change our
view of animal orientation, I propose to review four
examples where models have provided new insights
contradicting the current map hypothesis. Each of these
involves species suspected of being capable of long
distance orientation. The first two are alternative models
proposed in situations where homing by means of a familiar
area map was first envisaged: they concern rodents and
bees.  The  last  two  deal  with  situations  where  the  use  of  a 
navigational  map  has  been  hypothesized:  these  concern 
salmon  and  pigeons. 

2.  The  case  of  rodents’  homing  ability. 

Numerous  experiments  have  shown  that  rodents 
transported  a  relatively  long  distance  from  their  home  can 
return  to  their  previous  territory  with  a  probability  of 
success  which  is  higher  than  that  predicted  by  randomness 
(Joslin,  1977)  (fig.  1). 
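
To make "predicted by randomness" concrete, one can estimate by simulation how often an animal moving with no orientation mechanism at all would reach home. The sketch below is only an illustration under stated assumptions: the animal takes fixed-length steps in uniformly random directions, home is a disc around the origin, and the search stops after a fixed number of steps. None of these parameter values come from the studies cited here.

```python
import math
import random


def homing_probability(release_distance, home_radius=1.0, max_steps=500,
                       step_length=1.0, trials=1000, seed=0):
    """Estimate the chance that an unoriented random walk reaches home.

    The walker starts at (release_distance, 0); home is the disc of radius
    home_radius centred on the origin. All units are arbitrary.
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        x, y = release_distance, 0.0
        for _ in range(max_steps):
            heading = rng.uniform(0.0, 2.0 * math.pi)
            x += step_length * math.cos(heading)
            y += step_length * math.sin(heading)
            if math.hypot(x, y) <= home_radius:
                successes += 1
                break
    return successes / trials


p_near = homing_probability(3.0)   # released close to home
p_far = homing_probability(30.0)   # released ten times farther away
```

Observed return rates can then be compared with such a baseline at each release distance; a more realistic null model would use correlated rather than uniformly random headings.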

Various orientation mechanisms have been suggested
to explain this successful homing performance.
The  animals  have  been  said  to  rely  on  some  specific 
orientation  mechanism  such  as  route-based  navigation 
involving  the  use  of  magnetic  cues  (Mather  and  Baker, 
1980, 1981),  or  to  pilot  by  means  of  a  large  familiar  area 
map  (Furrer,  1973).  In  the  latter  case,  they  would  have  to 
build  up  a  topographical  representation  of  a  large 
territory,  extending  beyond  the  limit  of  their  actual  home 
range  during  the  course  of  exploratory  trips. 
Abstract

Agent  theory  in  AI  and  related  disciplines  deals 
with  the  structure  and  behaviour  of  autonomous, 
intelligent  systems,  capable  of  adaptive  action  to 
pursue  their  interests.  In  this  paper  it  is  proposed 
that  a  natural  reinterpretation  of  agent-theoretic 
intentional  concepts  like  knowing,  wanting, 
liking,  etc.,  can  be  found  in  process  dynamics. 

This  reinterpretation  of  agent  theory  serves  two 
purposes.  On  the  one  hand  we  gain  a  well 
established  mathematical  theory  which  can  be 
used  as  the  formal  mathematical  interpretation 
(semantics) of the abstract agent theory. On the
other hand, since process dynamics is a theory
that can also be applied to physical systems of
various kinds, we gain an implementation route
for the construction of artificial agents as bundles
of processes in machines. The paper is intended
as a basis for dialogue with workers in dynamics,
AI, ethology and cognitive science.

1  Introduction 

Agent theory is a branch of artificial intelligence (Kiss,
1988). Its domain is the theory, design and implementation
of artificial systems, similar to animals or people, that are
capable of autonomous, rational actions through which to
pursue their interests and goals. Aspects of this theory
cover, among other things, how actions are related to
knowledge, how plans for actions to reach goals can be
formed, how goals are formed, what the role of intentions
for  action  is,  how  the  state  of  the  world  is  perceived,  and 
many  others. 

The abstract formulation of agent theory can be stated in
many different languages, both informal and formal. Much
current work in this field has made use of formal logical
languages (Georgeff and Lansky, 1986). Although these
specialised logics are convenient and expressive, often it is
difficult to formalise their semantics, or the semantics that
have been offered have undesirable properties. An example
of this is the possible-world semantics of epistemic logics
which unfortunately makes agents omniscient.

The implementation of theories expressed in such formal
languages has additional problems. When agent
implementation is done by direct mechanisation of the
logic, for example as a theorem-prover, the resulting
systems turn out to be inefficient. This is a natural
consequence of the expressiveness of the language. On the
other hand, the languages are sometimes not expressive
enough to deal with some concepts that seem needed to
describe agents. An example is the expression of
quantitative magnitudes for describing strength of belief in
an agent.

Refinement of these logics and their formal semantics, and
their efficient implementation, is of course an ongoing
enterprise. This paper is intended as an informal
preliminary to such work, offering some intuitions about
the  interpretation  of  agent  theory  through  the  general  theory 
of  process  dynamics. 

Such  an  interpretation  can  also  provide  a  strategy  for 
implementation.  The  situation  is  analogous  to  the 
relationship between the abstract Boolean algebra of classes,
the propositional calculus, and hardware logic circuits. The
abstract algebra is defined in terms of classes and operations
on them: intersection, union, complementation, etc. One
interpretation of the Boolean algebra is propositional logic,
where the variables range over propositions and the
operations are truth-functional manipulations, etc. The
possibility of implementation arises from the fact that
another interpretation of Boolean algebra can be found in
the operation of physical electrical circuits. Because of
this, the operation of the circuits can be described by
propositional logic, or stated conversely, the circuits are an
implementation of the logic.
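
The point that one abstract algebra admits several interpretations can be shown in a few lines. The sketch below checks a single abstract law (De Morgan's) under two interpretations: classes, modelled as subsets of a small universe, and propositions, modelled as truth values. The helper name `de_morgan_holds` and the three-element universe are assumptions made for the example.

```python
from itertools import product


def de_morgan_holds(meet, join, complement, elements):
    """Check complement(a meet b) == complement(a) join complement(b)."""
    return all(
        complement(meet(a, b)) == join(complement(a), complement(b))
        for a, b in product(elements, repeat=2)
    )


# Interpretation 1: classes, i.e. subsets of a small universe.
UNIVERSE = frozenset({1, 2, 3})
classes = [frozenset(), frozenset({1}), frozenset({2, 3}), UNIVERSE]
class_ok = de_morgan_holds(
    lambda a, b: a & b,       # intersection
    lambda a, b: a | b,       # union
    lambda a: UNIVERSE - a,   # complementation
    classes,
)

# Interpretation 2: propositions, i.e. truth values.
prop_ok = de_morgan_holds(
    lambda a, b: a and b,
    lambda a, b: a or b,
    lambda a: not a,
    [False, True],
)
```

A circuit built from AND, OR and NOT gates would be a third interpretation of the same three abstract operations.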

Let  us  represent  this  by  the  following  schema: 

Propositional  logic  ->  Abstract  Boolean  algebra  -> 
Electrical  circuits 
ABSTRACT

This paper describes the recently developed "genetic
programming" paradigm which genetically breeds populations
of computer programs to solve problems. In genetic
programming, the individuals in the population
are hierarchical computer programs of various sizes and
shapes. This paper also extends the genetic programming
paradigm to a "co-evolution" algorithm which operates
simultaneously on two populations of independently-acting
hierarchical computer programs of various
sizes and shapes.

1. INTRODUCTION AND OVERVIEW

This paper describes the recently developed "genetic
programming" paradigm which genetically breeds populations
of computer programs to solve problems. In genetic
programming, the individuals in the population are hierarchical
compositions of functions and arguments of various sizes
and shapes. Each of these individual computer programs is
evaluated for its fitness in handling the problem environment,
and a simulated evolutionary process is driven by this
measure of fitness.
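
As a rough illustration of what a "hierarchical composition of functions and arguments" might look like, the sketch below represents a program as a nested tuple and evaluates it recursively. The function set, the variable name `x` and the squared-error fitness measure are assumptions chosen for the example, not details given in this paper.

```python
import operator

# Assumed function set for the example.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(program, env):
    """Evaluate a nested-tuple program such as ("+", ("*", "x", "x"), 1)."""
    if isinstance(program, tuple):
        name, *args = program
        return FUNCTIONS[name](*(evaluate(arg, env) for arg in args))
    if isinstance(program, str):
        return env[program]   # a variable looked up in the environment
    return program            # a numeric constant


def fitness(program, cases):
    """Sum of squared errors over (input, target) pairs; lower is better."""
    return sum((evaluate(program, {"x": x}) - target) ** 2 for x, target in cases)


cases = [(x, x * x + 1) for x in range(-3, 4)]
candidate = ("+", ("*", "x", "x"), 1)   # computes x*x + 1
weaker = ("+", "x", 1)                  # computes x + 1
```

Because programs are trees, subtrees of different sizes and shapes can be exchanged between individuals without breaking the representation.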

This paper also extends the genetic programming
paradigm to a "co-evolution" algorithm which operates
simultaneously on two (or more) populations of independently-acting
hierarchical computer programs of various
sizes and shapes. In co-evolution, each population acts as
the environment for the other population. In particular, each
individual of the first population is evaluated for "relative
fitness" by testing it against each individual in the second
population, and, simultaneously, each individual in the second
population is evaluated for "relative fitness" by testing
it against each individual in the first population. Over a period
of many generations, individuals with high "absolute
fitness" tend to evolve as the two populations mutually
bootstrap each other to increasingly high levels of fitness.
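
The relative-fitness evaluation described above can be sketched as follows. This is a minimal illustration that assumes a zero-sum contest, so that whatever the first player gains the second loses; averaging over opponents, the integer "individuals" and the `payoff` function are hypothetical stand-ins for real programs and a real game.

```python
def relative_fitness(population_a, population_b, payoff):
    """Score every individual against every member of the other population.

    payoff(a, b) is the result for a; under the zero-sum assumption
    the result for b is -payoff(a, b).
    """
    fitness_a = [
        sum(payoff(a, b) for b in population_b) / len(population_b)
        for a in population_a
    ]
    fitness_b = [
        sum(-payoff(a, b) for a in population_a) / len(population_a)
        for b in population_b
    ]
    return fitness_a, fitness_b


def payoff(a, b):
    """Toy contest: the larger number wins (+1), the smaller loses (-1)."""
    return (a > b) - (a < b)


fitness_a, fitness_b = relative_fitness([1, 5, 9], [2, 6], payoff)
```

Note that these scores are relative to the current opposing population only; no fixed, absolute measure of fitness enters the evaluation.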

In this paper, the genetic programming paradigm is illustrated
with three problems. The first problem involves
genetically breeding a population of computer programs to
allow an "artificial ant" to traverse an irregular trail. The
second problem involves genetically breeding a minimax
control strategy in a differential game with an independently-acting
pursuer and evader. The third problem illustrates
"co-evolution" and involves genetically breeding an optimal
strategy for a player of a simple discrete two-person game
represented by a game tree in extensive form.
2. BACKGROUND ON GENETIC
ALGORITHMS 

Genetic algorithms are highly parallel mathematical algorithms
that transform populations of individual mathematical
objects (typically fixed-length binary character strings)
into new populations using operations patterned after (1)
natural genetic operations such as sexual recombination
(crossover) and (2) fitness proportionate reproduction
(Darwinian survival of the fittest). Genetic algorithms begin
with an initial population of individuals (typically randomly
generated) and then iteratively (1) evaluate the individuals in
the population for fitness with respect to the problem environment
and (2) perform genetic operations on various individuals
in the population to produce a new population. John
Holland of the University of Michigan presented the pioneering
formulation of genetic algorithms for fixed-length character
strings in Adaptation in Natural and Artificial Systems
(Holland 1975). Holland established, among other things,
that the genetic algorithm is a mathematically near optimal
approach to adaptation in that it maximizes expected overall
average payoff when the adaptive process is viewed as a
multi-armed slot machine problem requiring an optimal
allocation of future trials given currently available information.
Recent work in genetic algorithms and genetic classifier
systems can be surveyed in Goldberg (1989), Davis
(1987), and Schaffer (1989).
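
The evaluate-then-recombine loop described above can be sketched in a few lines. This is a minimal illustration, not Holland's formulation: it assumes fixed-length binary strings, fitness proportionate selection of parents, one-point crossover, no mutation, and a toy fitness function that counts the ones in a string. The population size, string length and generation count are arbitrary.

```python
import random


def evolve(fitness, length=20, size=50, generations=40, seed=0):
    """Run a bare-bones genetic algorithm; return (initial, final) populations."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(size)]
    initial = [individual[:] for individual in population]
    for _ in range(generations):
        # (1) Evaluate every individual for fitness.
        weights = [fitness(individual) for individual in population]
        # (2) Breed a new population: parents are drawn in proportion to
        # fitness and recombined by one-point crossover.
        next_population = []
        while len(next_population) < size:
            mother, father = rng.choices(population, weights=weights, k=2)
            cut = rng.randint(1, length - 1)
            next_population.append(mother[:cut] + father[cut:])
            next_population.append(father[:cut] + mother[cut:])
        population = next_population[:size]
    return initial, population


def mean_fitness(population, fitness):
    return sum(fitness(individual) for individual in population) / len(population)


# Toy problem: fitness is the number of ones in the string.
initial, final = evolve(sum)
```

Comparing `mean_fitness(initial, sum)` with `mean_fitness(final, sum)` shows the effect of selection; a practical implementation would normally add mutation to keep diversity in the population.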

3.  BACKGROUND  ON  GENETIC 
PROGRAMMING  PARADIGM 

Representation is a key issue in genetic algorithm work
because genetic algorithms directly manipulate the coded representation
of the problem and because the representation
scheme can severely limit the window by which the system
observes its world. Fixed length character strings present difficulties
for some problems, particularly problems in artificial
intelligence where the desired solution is hierarchical
and where the size and shape of the solution is unknown in
advance. The need for more powerful representations has
been recognized for some time (De Jong 1983, 1988).

The structure of the individual mathematical objects that
are manipulated by the genetic algorithm can be more complex
than the fixed length character strings. Smith (1980)
departed from the early fixed-length character strings by introducing
variable length strings, including strings whose
elements were if-then rules (rather than single characters).
Holland's introduction of the classifier system (1986) continued
the trend towards increasing the complexity of the
Abstract

I present a scheme for partitioning
information used in decision making.
Three types of information are recognized:
internal information, or an individual's
internal state; external information,
or environmental factors; and
relational information, or rules for
predicting transformations of internal
state. A genetic simulation model is
described which tracks the evolution of
alleles for high and low information
access in each information type in a
population with density dependence.
Stable polymorphisms result. Interactions
between the three genes are explored.
The relevance of the model to
foraging situations is discussed.

1.  Introduction 

A major criticism of classical optimal foraging
models is that they assume complete information
(Stephens & Krebs 1986). The problem of incomplete
information has received considerable attention
in the past twenty years, and the basic
prey and patch models have been expanded to
consider incomplete information in prey recognition
(Houston et al. 1980, Getty & Krebs 1985),
patch sampling (McNamara 1982, Lima 1984,
Bernstein et al. 1988), and tracking a changing
environment (Stephens 1987, Shettleworth 1988).
In each of these cases the information studied is
"about" the environment; that is, animals sense
the states of pertinent environmental parameters.

Is environmental information the only sort of
information an animal needs to make its decisions?
How might one partition information in
a way that is useful in thinking about the
evolution of behavior?

Let us begin with a simplified look at the
process of survival and reproduction. Any individual
in a population can be described at a
given time by an internal state which largely
determines its current reproductive potential. That
state will be a complex of many factors, including
stored energy, health, fertility, and (in a
sexual population) attractiveness to mates. The
lifetime of any individual can be thought of
as a time series of internal state transformations.
Transformation will be influenced by the
individual's own behavior and by factors in the
environment which influence that individual, and
will be governed by a set of rules which we
may collectively describe as a transforming
function.

From this basic scheme, there appear three
types of information to which an individual
making decisions may have access. First,
the individual may be aware of its own
state. Call this internal information. Second,
it may be aware of environmental conditions, or
external information. Third, it may be aware of
the form of the transforming function, or
relational information.

Now consider foraging models with these
three information types in mind. The classical
models assume complete external and relational
information; but since they are static, or state-independent,
they ignore internal state. Their refinements
deal with deficiencies in external information
only. Dynamic optimization (Mangel &
Clark 1988) considers state-dependent decision
making; but it assumes complete internal information.
Finally, studies of rules of thumb
(Janetos & Cole 1981, Green 1984) implicitly
consider reductions in relational information,
mainly as satisficing strategies (Simon 1956)
due to functional constraints, but assume complete
internal and external information.

This  paper  presents  a  model  that  examines 
the  interaction  of  internal,  external  and  relational 
information  genes  in  an  evolving  population.  I 
chose  simulation  methods  to  model  this  situation 
for  two  reasons.  First,  including  enough  processes 
to  generate  information  of  each  type  makes  the 
model  sufficiently  complex  to  prohibit  an  easy 
analytical  analysis.  Second,  I  was  interested  in 
Abstract

The  purpose  of  this  work  is  to  investigate  and 
evaluate  different  reinforcement  learning 
frameworks  using  connectionist  networks.  I  study 
four  frameworks,  which  are  adopted  from  the  ideas 
developed  in  [Barto,  Sutton  &  Watkins, 
1989;  Watkins,  1989;  Sutton,  1990].  The  four 
frameworks are based on two learning procedures:
the Temporal Difference methods for solving the
credit assignment problem, and the backpropagation
algorithm for developing appropriate internal
representations. Two of them also involve learning a
world model and using it to speed learning. To
evaluate their performance, I design a dynamic
environment and implement different learning
agents, using the different frameworks, to survive in
it. The environment is nontrivial and
nondeterministic. Surprisingly, all of the agents can
learn to survive fairly well in a reasonable time
frame. This paper describes the learning agents and
their performance, and summarizes the learning
algorithms and the lessons I learned from this study.


1.  Introduction 

Reinforcement learning is an interesting learning
problem. It requires only a scalar reinforcement signal as a
performance feedback from the environment.
Reinforcement learning often involves two difficult
subproblems. The first is called the credit assignment
problem. Suppose the learning agent performs a sequence
of actions and finally obtains certain outcomes. It must
figure out how to assign credit or blame to each individual
situation (or situation-action pair) to adjust its decision
making and improve its performance. The second
subproblem arises from the need to develop the appropriate
internal representations required to achieve the target
learning task. In the course of learning, both subproblems
must be solved.

Several reinforcement learning frameworks or algorithms
have been proposed in the literature (e.g., [Sutton,
1984; Williams, 1987; Barto, Sutton & Watkins,
1989; Watkins, 1989; Kaelbling, 1989; Sutton, 1990]).
However, most have only been studied solving simple
learning problems (e.g., [Anderson, 1989]). In addition, no
serious comparison of different frameworks has been done.
This work is thus intended to be a first step towards the
investigation and evaluation of different reinforcement
learning frameworks in solving nontrivial learning tasks. In
particular, I am interested in reinforcement learning using
connectionist networks.

In the paper I study four reinforcement learning
frameworks, which are adopted from the ideas developed in
[Barto, Sutton & Watkins, 1989; Watkins, 1989; Sutton,
1990]. All of these frameworks are based on two learning
procedures: the Temporal Difference (TD) methods [Sutton,
1988] for solving the credit assignment problem and the
error backpropagation algorithm [Rumelhart et al.,
1986a] for developing appropriate internal representations.
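
How a temporal difference method spreads credit backwards along a sequence of situations can be seen in a very small example. The sketch below uses tabular TD(0), the simplest one-step member of the TD family, rather than the connectionist networks studied in this paper. The environment is an assumed deterministic chain in which only the final step is rewarded; `alpha` (learning rate) and `gamma` (discount factor) are illustrative values.

```python
def td0_chain(n_states=4, episodes=200, alpha=0.5, gamma=0.9):
    """Learn state values on a chain 0 -> 1 -> ... -> terminal with TD(0).

    A reward of 1 is received only on the step into the terminal state.
    """
    values = [0.0] * (n_states + 1)   # the last entry is the terminal state
    for _ in range(episodes):
        for state in range(n_states):
            next_state = state + 1
            reward = 1.0 if next_state == n_states else 0.0
            # TD error: difference between the one-step lookahead estimate
            # and the current estimate for this state.
            td_error = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * td_error
    return values[:n_states]


values = td0_chain()
```

Although reward arrives only at the end, each state's value converges to the discounted reward that follows it, so earlier situations receive a share of the credit.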

Generally speaking, reinforcement learning based solely
on  the  TD  methods  is  a  slow  process.  In  domains  where 
reinforcements  are  sparse,  the  learning  rate  is  slow,  and  if 
the  cost  of  mistakes  (e.g.,  physical  damage)  is  also  high,  the 
agent  would  make  more  mistakes  than  allowed.  A  solution 
to  these  problems  is  to  learn  a  world  model,  and  practice 
with  the  model.  This  idea  is  embodied  in  two  of  the 
frameworks  studied  here. 

My approach to evaluating different learning frameworks
is to design a dynamic environment, implement learning
agents to survive in it using different frameworks, and
evaluate  the  performance  of  the  agents.  Four  kinds  of 
objects  are  involved  in  this  environment:  the  agent,  fixed 
food  and  obstacles,  and  moving  enemies.  Although  survival 
in  this  environment  is  easy  for  humans,  it  is  by  no  means 
trivial  for  knowledge-poor  agents. 

The remainder of this paper is organized as follows.
Section 2 discusses the four learning frameworks. Section 3
describes the rules of the environment. Sections 4 and 5
present the implementation and performance of the learning
agents. Section 6 assesses the merits of the agents. Finally,
Section 7 concludes the paper by summarizing the lessons I
learned from this study.


2.  Reinforcement  Learning  Frameworks 
Learning  to  survive  in  an  unknown  environment  can  be 
characterized as a kind of reinforcement learning. In
reinforcement  learning,  the  learning  agent  continually 
receives  sensory  inputs  from  the  environment,  selects  and 
Abstract

A  simple  model  is  presented,  aimed  at 
capturing  the  essential  traits  of  the  interacting 
behaviour  of  adapting  species.  The  modelling 
paradigm  followed  is  centered  on  intrinsic 
adaptation,  with  no  explicit  fitness  function, 
and is implemented using a probabilistic
cellular  automaton.  Some  simulation  results  are 
shown,  regarding  the  outset  of  a  predator 
species  and  prey/predator  population 
dynamics,  also  with  respect  to  environmental 
structural  changes. 

1.  Introduction 

Complex  systems  science  aims  at 
capturing  the  fundamental  characteristic  of 
adaptive  systems  composed  of  multitudes  of 
interacting  entities.  The  fundamental 
concept  around  which  all  its  approaches 
hinge is that of self-organization [Nicolis,
Prigogine, 1977], that is the emergence of
organized  behaviour  in  systems  which  were 
not  designed  explicitly  to  manage  entities  at 
the  level  of  the  outputs.  A  common  feature 
of  all  such  models  in  fact  regards  the  input 
specification  and  the  system  description, 
which  are  defined  at  an  aggregation  level 
quite far - both in terms of object
structuring and of characteristic time scale
- from that of the output of interest.


realities  unapproachable  with  traditional 
analytic  techniques.  A  paradigmatic 
example  is  the  evolution  of  an  ecosystem 
where  several  species  coexist.  In  this 
situation  every  species  affects  with  its 
presence  the  environment  it  lives  in, 
consisting  both  in  the  world  and  in  its  other 
occupants:  its  survival  probability,  along 
with  its  fitness  to  the  environment,  should 
therefore  continuously  be  readjusted.  In  the 
paper  we  present  a  model  where  fitnesses 
are  only  implicitly  dealt  with,  in  that  they 
emerge from environment self-organizing
evolution  and  can  be  computed  only  a 
posteriori.  Given  the  description  of  an 
environment  and  an  initial  uniform 
population,  we  simulate  the  adaptation  of 
the  population  to  the  environment,  along 
with  the  possible  outset  of  new  species. 

The  paper  is  organized  as  follows:  in 
section  2  we  introduce  the  simulation 
approach  followed  in  our  research  along 
with  the  essential  features  of  our  model,  in 
section  3  we  describe  the  methodology 
followed  to  introduce  interaction  among  the 
basic  individuals,  in  section  4  we  give  a 
more  detailed  description  of  our 
implemented  system  and  propose  some 
simulation results. Finally, in section 5, we
briefly  outline  our  current  activities 
regarding  extensions  of  the  model. 
Abstract.

In this paper we propose a mechanism for motivational
competition and selection of behavior. One important
characteristic of this mechanism is that the selection of
behavior is modeled as an emergent property of a parallel
process. This is in contrast with mechanisms for behavior
selection and motivational competition proposed earlier,
which are based on a hierarchical, preprogrammed control
structure. We show that selection of behavior can be
modeled in a bottom-up way using an activation/inhibition
dynamics among the different behaviors that can be
selected. There is no weighing up of behaviors in a cognitive
manner, nor are hierarchical or bureaucratic
structures imposed. The paper elaborates upon the results
we obtained with simulated creatures based on this
mechanism. It draws parallels between characteristics
observed in animal behavior and characteristics demonstrated
by our artificial creatures. Examples are: displacement
behavior, opportunistic behavior, fatigue, selective
attention, and so on.


1.  Introduction 

This  paper  is  concerned  with  the  problem  of  behavior 
selection  for  an  artificial  creature.  The  context  in  which 
we  discuss  this  problem  is  that  of  the  behavior-based  sys¬ 
tems  (Brooks,  1986)  (Brooks,  1990),  which  embody  a  new 
philosophy  for  building  artificial  creatures,  inspired  by  the 
field of Ethology (McFarland, 1981) and not unrelated to
the  Society  of  Mind  theory  (Minsky,  1986). 


A  creature  is  viewed  as  consisting  of  a  set  of 
behaviors.  Examples  of  behaviors  are:  the  feeding 
behavior,  sleeping  behavior,  drinking  behavior,  etc.  Only 
a few - or often only one - of these behaviors can be
active  at  a  time.  However,  a  creature  at  every  moment  is 
probably  motivated  towards  a  variety  of  them.  This  means 
that  there  has  to  be  some  mechanism  which  decides  which 
behavior  “wins”  and  as  such  gets  control  over  the  “mus¬ 
cles”  or  actuators  of  the  artificial  creature. 
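
To make the idea of a "winning" behavior concrete, here is a minimal sketch of selection by mutual inhibition. It is not the mechanism developed in this paper; the update rule, the `inhibition` and `threshold` constants, and the behavior names are assumptions introduced only for illustration. Each behavior accumulates activation from its motivation and is suppressed by the activation of its rivals; the first to cross the threshold takes control.

```python
def select_behavior(motivations, inhibition=0.5, threshold=1.0, max_steps=1000):
    """Return the name of the first behavior whose activation reaches threshold.

    motivations: dict mapping behavior name -> non-negative input per step.
    Each step, every behavior gains its motivation and loses a fraction of
    the mean activation of its competitors (mutual inhibition).
    Returns None if no behavior reaches the threshold within max_steps.
    """
    activation = {name: 0.0 for name in motivations}
    rivals = max(len(activation) - 1, 1)
    for _ in range(max_steps):
        total = sum(activation.values())
        updated = {}
        for name, drive in motivations.items():
            others = total - activation[name]
            value = activation[name] + drive - inhibition * others / rivals
            updated[name] = max(value, 0.0)  # activation cannot go negative
        activation = updated
        winner = max(activation, key=activation.get)
        if activation[winner] >= threshold:
            return winner
    return None
```

For example, `select_behavior({"feed": 0.3, "drink": 0.2, "sleep": 0.1})` lets the most strongly motivated behavior suppress the others and win, without any central arbiter comparing the options.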

In the case of simple animals, and also in the case of
simple artificial creatures, the optimal strategy can be
hard-wired by nature (natural selection) or by the
programmer, respectively. The changes in behavior can be entirely
preprogrammed, and selection of behavior is a matter of
routine, showing very regular, rhythmic patterns.
Although such a preprogrammed decision strategy may be
useful for creatures living in a very stable and predictable
environment, it does not suffice for creatures that have
many jobs to do in an environment in which the opportunities
to perform a job vary considerably (see McFarland,
1981, for the case of a natural creature; see Maes, 1990b,
for the case of an artificial creature).

Complex creatures need a more flexible behavior
selection  mechanism  which  bases  selection  on  the  internal 
motivational  state  of  the  creature  as  well  as  on  external 
circumstances.  It  is  clear  from  observation  of  animal 
behavior  that  a  change  in  the  external  environment  may 
override  the  current  behavior,  for  example,  with  some 
alarm  response  behavior.  But  animals  also  demonstrate 
changes  in  behavior  without  a  change  in  the  external 
situation,  which  suggests  that  behavior  selection  is  also 
determined  by  internal  motivation.  E.g.  a  domestic  hen, 
when presented with an egg, may eat it on certain occasions
or brood on it on other occasions (McFarland,
Abstract

Results from electrophysiology, anatomy, and simulations
show that the tectal sensori-motor system in
salamanders can be understood as a system that preserves
retinal coordinates on both tectal hemispheres
and establishes a head-centered three-dimensional
coordinate system by a combination of the bilateral
visual maps. Thus, the outputs of the sensory maps
are results of retinal inputs and intertectal signal
transfer. The sensory maps are connected to respective
motor maps organized in the coordinates of the
neck muscles. The bilateral distribution of excitation
in the whole system enables the animal to perform
head saccades directed toward a stimulus at an
arbitrary location within its egocentric frame of
reference.

1. The biological system

Research on the system controlling saccades in amphibians
has largely concentrated on the visual input side. It is well
established that moving stimuli, like squares of moderate
size or bars elongated in the direction of movement, elicit
saccades of the head with a high probability. The probability
is reduced if stimuli of other shapes are used.

As in other vertebrates, saccades are triggered by
the optic tectum and can be released by electrical stimulation
of this area. Tectal output units are well activated by
visual stimuli like squares or horizontal rectangles which
are moved across their receptive fields (cf. Himstedt et al.,
1987). The tectum receives an orderly arranged input from
the eyes, establishing a retinotopic map on which the nasal
visual field is represented rostrally and the temporal visual
field caudally; the superior visual field is represented close
to the dorsal midline, whereas the ventro-lateral margin is
stimulated by visual objects in the lower part of the visual
field.

Recently we were able to show that an ipsilateral
visual map exists as well. It is established by an information
transfer from the other tectal hemisphere (that receives input
from the contralateral eye). The ipsilateral map represents
only the binocular part of the visual field and, therefore,
covers only the rostral half of the optic tectum. The
ipsilateral map is point-symmetrical to the contralateral one
and thus provides the means to calculate binocular disparity
(Manteuffel et al., 1989).

In comparison to the amount of knowledge on visual
properties of tectal neurons, there are only few data on
sensori-motor coupling in the saccade control system. It is
known that electrical stimulation of the toad's tectum evokes
saccades toward a location that roughly corresponds to
the retinotopic map established by afferents from the
contralateral eye (Ewert, 1967). However, recent stimulation
experiments have shown that there exists no simple
correlation between stimulation site, stimulation strength
and evoked saccade (Jordan et al., 1990). It is likely that the
motor map of the tectum in salamanders is not arranged in
the coordinates of external space. The bimodal distribution
of the populations of efferent neurons in a dorsal and a
ventrolateral group with increasing density toward caudal
tectal levels (Fig. 1) rather indicates a recruitment system
arranged in the coordinates of the neck muscle system
(Manteuffel, 1990; Naujoks-Manteuffel and Manteuffel,
1990). According to this principle, larger saccades would be
evoked when more premotor neurons become activated. In
fact, large saccades are necessary toward temporal goals,
which stimulate more posterior sites of the optic tectum
in the visual domain.

The motor system executing the saccades in salamanders
is comparatively simple. The basic structure of the
arrangement of the muscles responsible for head movements
has most likely been inherited from fish ancestors
with a basic equipment of paired epaxial and hypaxial muscles
(i.e. above and below the axis of the spinal column).
With a joint intercalated between the posterior pole of the
head and the first vertebra, vertical and horizontal articulations
can occur. Therefore, the resulting head movements
can be described best in the coordinates given by the directions
of the forces of the two pairs of muscles. The bilateral
epaxial muscles (m. intertransversarius capitis superior) insert
at the ear capsule and the hypaxial m. rectus cervicis at
the os triangulare. Both pairs of muscles insert caudally at
transverse processes of the second and third vertebrae.

Saccades are largely ballistic movements in salamanders
(Werner and Himstedt, 1985), in general falling
short with larger horizontal angles. Therefore two or
more saccades are often needed to bring a target into the
center of the visual field. The animals approach a prey by
executing a saccade followed by a few steps of straight
walk. If necessary, this sequence can occur repetitively
Abstract

This  paper  presents  a  neurobiologically-feasible  spatial 
representation  model.  The  model  was  implemented  and 
tested  on  a  physical  autonomous  mobile  robot.  It  was 
shown  to  be  both  computationally  simple  and  physically 
robust. 

The  described  model  is  a  possible  interpretation  of 
the  organization  and  function  of  the  rat  hippocampus. 
The  paper  presents  relevant  biological,  psychological, 
and  neurobiological  data,  and  gives  a  detailed  set  of 
comparisons  between  the  physical  hippocampus  and  our 
“synthetic”  rat  implementation.  The  implications  of  the 
many  similarities  are  described.  Finally,  areas  for  future 
study  in  both  biology  and  robotics  are  suggested. 


1  Introduction 

Most  animals,  including  humans,  spend  much  of  their 
waking  time  in  transit  from  one  place  to  another  [Wa¬ 
terman  89].  Purposefully  moving  about  requires  a 
system  for  spatial  modeling  integrated  with  the  mech¬ 
anisms  for  handling  navigation,  locomotion,  and  moti¬ 
vation.  These  systems  have  evolved  to  perform  with 
impressive  robustness.  Understanding  their  function 
has  long  been  a  goal  of  cognitive  scientists,  biologists, 
and  neuroscientists.  More  recently,  this  goal  has  been 
adopted  by  members  of  the  Artificial  Intelligence  and 
robotics communities interested both in simulating biological
systems and designing better artificial ones.
The question asked by both communities is: “What
kind  of  spatial  information  is  stored?”  In  order  to  an¬ 
swer  it,  experiments  are  designed  to  test  where  on  the 
qualitative-to-quantitative  scale  the  representation  lies, 
and  whether  it  is  centralized  or  distributed.  This  paper 
describes  a  qualitative,  distributed  spatial  representa¬ 
tion  tested  empirically  on  a  mobile  robot. 


2  Cognitive  Maps 

A cognitive map is a generic term for an internal representation
of spatial information. The term has come to
connote  a  very  analytical,  centralized  representation.  In 
this  paper,  we  will  use  the  term  in  its  generic  meaning, 
and  analyze  its  variants. 

A  cognitive  map  is  usually  assumed  to  represent  space 
with  a  set  of  landmarks,  each  of  which  is  an  element  (ob¬ 
ject  or  feature)  serving  as  a  point  of  reference  [Presson 
and  Montello  88].  According  to  Piaget,  a  landmark 
is  a  spatial  primitive,  and  thus  a  basic  building  block 
of  spatial  representations  [Piaget  and  Inhelder  67]. 
Although  most  landmark  studies  concentrate  on  visual 
cues, the concept generalizes to any perceptible feature.
Animals  construct  landmarks  from  auditory,  olfactory, 
and  tactile  cues  as  well  [Gould  82],  taking  advantage 
of  their  different  characteristics  [O’Keefe  89]. 

In  this  paper,  we  will  analyse  cognitive  maps  along 
two  dimensions:  1)  what  information  they  encode  and 
2) how they encode it. The “what” dimension can vary
from  completely  qualitative  or  topological  to  very  quan¬ 
titative  or  metric.  The  “how”  dimension  varies  from 
totally  global  or  centralized  to  entirely  distributed  or 
decentralized. 

2.1  How  Qualitative? 

The nature of the representation determines the type
and  number  of  landmarks  required  for  localizing.  In  a 
qualitative  representation,  an  object  can  be  remembered 
as  being  proximate  to  a  landmark,  defined  within  a  ra¬ 
dius  around  it.  On  the  other  end  of  the  spectrum,  the 
position  of  an  object  can  be  computed  precisely  from 
the  known  locations  of  three  landmarks  [Pick,  Mon¬ 
tello  and  Somerville  88].  The  question  is  how  much 
metric  information  is  recorded. 
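
As an illustration of the quantitative end of this spectrum, the sketch below recovers a planar position from three landmarks. The text does not say whether distances or bearings are used; this example assumes distances to landmarks with known coordinates, and the function name is hypothetical. Subtracting the first circle equation from the other two leaves a 2x2 linear system in the unknown position.

```python
def locate_from_landmarks(landmarks, distances):
    """Planar position from distances to three non-collinear landmarks."""
    (x1, y1), (x2, y2), (x3, y3) = landmarks
    d1, d2, d3 = distances
    # Subtract circle 1 from circles 2 and 3 to get two linear equations
    # a*x + b*y = c in the unknown position (x, y).
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("landmarks are collinear; position is ambiguous")
    # Solve the 2x2 system by Cramer's rule.
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y
```

A qualitative representation, by contrast, would only need to store that the object lies within some radius of a single landmark, with no such computation.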

The psychological literature is divided on this issue.
Studies  testing  response  times  in  object  position  recall 
ABSTRACT

There  is  an  analogy  between  animal  and  product  design  that 
can  be  formulated  as  an  exact  mathematical  analogy.  The 
success  of  a  biological  design  is  measured  by  the  success  of 
the  genes  that  produce  it,  and  this  depends  upon  the  ability 
of  those  genes  to  increase  their  representation  in  the 
population  in  the  face  of  competition  from  rival  genes. 
Similarly,  when  a  variety  of  products  is  under  consideration, 
they  vary  in  the  period  required  for  product  development,  in 
the  chance  of  failure  in  the  market  place,  and  in  the  expected 
returns  from  sales  if  the  product  is  successful.  The 
development  period  refers  to  the  period  before  any  return  is 
achieved on investment. For animals this is the period
between birth and reproduction, and for products it is the
period prior to the time that financial return accrues to the
investor. The success of a design is evaluated by the net rate
of increase of the genes coding for it (i.e. the return on
investment) in the animal case, or, in the case of a product
launched into the marketplace, of the money invested in it.
If  we  are  to  take  the  biological  approach  to  robot  design 
seriously,  then  we  should  first  consider  the  ecological  (or 
market)  niche  that  a  proposed  robot  is  to  occupy.  Is  the 
robot  to  be  a  toy,  a  brick-laying  robot,  or  a  bomb-disposal 
robot? Just as there are no general-purpose animals, so there
should  be  no  general-purpose  robots.  For  robot  behaviour  to 
be  adaptive,  in  terms  of  the  analogy,  it  must  optimise  with 
respect  to  the  selective  pressures  of  the  market  place.  Other 
forms  of  adaptation,  such  as  acclimatisation  and  learning, 
are  subject  to  the  same  criteria. 


The term adaptation, as used in biology, has a number of
meanings. Biologists usually distinguish between (1)
evolutionary adaptation, which concerns the way in which
species adjust genetically to changed environmental conditions
in the very long term; (2) physiological adaptation, which has
to do with the physiological processes involved in the adjustment
by the individual to climatic changes, changes in food quality,
etc.;  (3)  sensory  adaptation,  by  which  the  sense  organs  adjust 
to  changes  in  the  strength  of  the  particular  stimulation  which 
they  are  designed  to  detect;  and  (4)  adaptation  by  learning, 
which  is  a  process  by  which  animals  are  able  to  adjust  to  a 
wide  variety  of  different  types  of  environmental  change. 


Fig. 1 Adaptation by acclimatisation. The physiological changes that
occur in acclimatisation to altitude run through a spectrum, ranging
from fast but costly processes to slow-acting processes that are cheap
in energetic terms. The figure plots, against time in days, the adaptive
changes in a man breathing rarefied air for 4 days, followed by 6 days at
sea level: V = lung ventilation, E = serum erythropoietin, H = rate
of hemoglobin synthesis, R = fraction of red blood cells
(after Adolph, 1972).

Adaptation implies cost reduction, as can be seen from the
example in Figure 1. In animal behaviour, real costs relate to
Darwinian fitness. So, in considering the usage of the term
adaptation in robotics, we should ask: is there a concept
equivalent to the fitness of a robot?

I will argue that there is such an equivalent concept, and
that the analogy between animal and product design can be
formulated  as  an  exact  mathematical  analogy.  Briefly,  the 
success  of  a  biological  design  is  measured  by  the  success  of 
the  genes  that  produce  it,  and  this  depends  upon  the  ability  of 
those  genes  to  increase  their  representation  in  the  population 
in  the  face  of  competition  from  rival  genes.  How  does  this 
SUMMARY


Our  operating  hypothesis  can  be  stated  simply 
enough.  It  is  that  the  class  of  non-trivial 
solutions  to  the  problem  of  information 
management is small (see McGonigle, 1987). If
true,  it  follows  that  intelligent  systems  (however 
instantiated)  are  solution  constrained  a  priori 
and  must  converge  on  a  similar  design  logic  if 
they  are  to  succeed.  In  the  case  of  biologically 
instantiated  intelligence,  this  convergence  is  seen 
more  as  a  primary  consequence  of  such  solution 
constraint  (for  example,  the  power-generality 
trade-off),  and  less  one  of  genetic  affinity 
(although  it  may  be  secondarily  a  factor). 

Our  general  goal  is  the  characterisation  of 
intelligent  systems  in  the  broadest  sense, 
informed in particular by converging research in
comparative,  developmental  and  cognitive 
psychology  situated  within  a  neuroscience 
framework  (see  McGonigle  and  Chalmers, 
1990a).  More  recently,  we  have  opened  a 
dialogue  with  roboticists  at  Edinburgh 
(Smithers,  Malcolm  and  Donnett,  in  particular) 
and  are  delighted  to  see  the  growing  basis  for 
productive  dialogue.  The  position  outlined  here 
also  intersects  with  one  espoused  by  Brooks 
(1986)  but  has  an  independent  origin  and 
rationale. 

The  goal  of  this  paper  is  to  summarise  some  of  the 
characterisations  which  emerge  from  work  on 
biological  systems  in  an  attempt  both  to  cross¬ 
check  with  designers  of  artificial  systems  and  to 
exchange  concepts  of  possible  mutual  benefit.  To 
limit the vast area under review, I shall
concentrate on the incremental aspect of
intelligent systems as this is, as I see it, the key
issue.


CHARACTERISATION 

Although  there  are  (and  have  been)  many 
documented  approaches  to  the  study  of 
evolutionary  intelligence,  most  have  failed  on  the 
key  issue  of  how  systems  'invest  in  complexity' 
or  grow  (ontogenetically  or  phylogenetically) 
from 'weak to strong'. Classical ethology, for
example, has worked best with simpler 'reactive'
systems which 'wear their adaptation' on their
sleeve¹. Certainly, their domain of inquiry has
precluded  the  study  of  human  problem  solving  and 
intelligence  either  from  a  comparative,  a 
cognitive  or  a  developmental  standpoint.  As  a 
consequence,  ethologists  may  offer  some  solace  to 
designers  interested  in  making  simple  reactive 
agents  as  a  first  step.  However,  and  crucially  for 
our  current  agenda,  they  offer  few  general 
characterisations  of  intelligent  systems  that 
could afford strong clues as to the ways and means

¹ A psychologist looking at these reactive systems
as  a  group,  however,  might  characterise  their  group 
adaptation  as  an  example  of  the  power-generality 
trade-off,  both  in  terms  of  niche/habitat  selection 
and  in  the  specialisation  of  subsets  within  the  group 
as  in  (say)  bees.  However,  a  study  of  learning 
mechanisms  per  se  as  an  alternative  to  reactive 
agents,  fares  little  better.  Motivated  by  black  box 
behaviourism,  the  search  for  universal  laws  of 
learning and memory has failed to provide any stable
correlates  of  species  differentiation  according  to 
phylogeny  or  brain  architectures.  Although  learning 
appears  early  in  the  evolutionary  process,  it  is  not 
the  fact  of  learning,  but  WHAT  is  learned  that 
differentiates  animals.  And  if  simple  habit  formation 
is  mainly  what  has  been  achieved  in  most  of  these 
experiments, it is not surprising that they have not
picked up direct implications of advanced nervous
systems,  for,  as  Mishkin  (1985)  has  claimed,  such 
habit  mechanisms  are  mediated  by  the  motor  cortex 
(an 'old' area) of the brain.
Abstract


Following a general presentation of the numerous means
whereby animats - i.e. simulated animals or autonomous
robots - are enabled to display adaptive behaviors, various
works making use of such means are discussed. This
review  cites  172  references  and  is  organized  into  three  parts 
dealing  respectively  with  preprogrammed  adaptive 
behaviors,  with  learned  adaptive  behaviors,  and  with  the 
evolution  of  these  behaviors.  A  closing  section  addresses 
directions  in  which  it  would  be  desirable  to  see  future 
research  oriented,  so  as  to  provide  something  other  than 
proofs of principle or ad hoc solutions to specific problems,
however interesting such proofs or solutions may be in
their own right.


1.  INTRODUCTION 

In  a  changing,  unpredictable,  and  more  or  less  threatening 
environment  the  behavior  of  an  animal  is  adaptive  as  long 
as  the  behavior  allows  the  animal  to  survive.  Under  the 
same  conditions,  the  behavior  of  a  robot  is  considered  to  be 
adaptive  as  long  as  the  robot  can  continue  to  perform  the 
functions for which it was built. Now, the survival of an
animal is intimately involved with its physiological state
and the successful operation of a robot depends upon its
mechanical condition. Under these circumstances, it is
obvious that one can associate with an animat - whether the
term indicates a simulated animal or an autonomous robot
(Wilson, 1985, 1987a) - a certain number of state variables
upon which its survival or successful operation depends, and
that each of these state variables is characterized by a range
of variation within which the animat's continued survival
or operation is preserved. Such variables were referred to
as essential variables by Ashby (1952) long ago. Their
variation ranges describe a viability zone inside the given
state space, and the animat can be referenced at any instant
by a point within this zone (Figure 1). Under the influence
of environmental or behavioral variations affecting the
animat, the corresponding reference point moves and may at
times come close to the limits of the viability zone. In
this case, the animat's behavior can be called adaptive so
long as it avoids transgressing the viability boundary
(Ashby, 1952; Sibly & McFarland, 1976).


Figure 1. Viability zone associated with two essential
variables, V1 and V2. The animat's behavior is adaptive
because corrective action has been taken at point B, so as to
avoid crossing out of the corresponding viability zone at point A.
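
The notion of a viability zone can be sketched in a few lines. The example below assumes, for simplicity, a rectangular zone given by an independent range for each essential variable, and a warning band near each limit where corrective action would be due; the function name, the `margin` parameter and the three status labels are illustrative assumptions, not part of the cited work.

```python
def viability_status(state, ranges, margin=0.1):
    """Classify a state as 'viable', 'at_risk', or 'outside'.

    state:  dict of essential-variable name -> current value.
    ranges: dict of name -> (low, high) bounds of the viability zone.
    margin: fraction of each range treated as a warning band near a limit.
    """
    at_risk = False
    for name, value in state.items():
        low, high = ranges[name]
        if not low <= value <= high:
            return "outside"  # the viability boundary has been crossed
        band = margin * (high - low)
        if value - low < band or high - value < band:
            at_risk = True  # close to a limit: corrective action is due
    return "at_risk" if at_risk else "viable"
```

In these terms, behavior is adaptive if the controller reacts to an "at_risk" reading in time to keep the state from ever becoming "outside".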

Such behavior can be generated by means of several
different or complementary abilities and architectures. For
example, the laws governing the animat's operation may
rely upon various homeostatic mechanisms thanks to
which, if the reference point alluded to earlier moves away
from an adapted point of equilibrium - adapted because it is
suitably located within the viability zone - this process
tends to return it to its original position, thereby decreasing
the risk that it will pass outside the limits of the zone.
Other  ways  in  which  to  lower  this  risk  involve  the  use  of 
high-quality  sensory  organs  or  motor  apparatus  that  allow 
the  animat  to  detect  as  early  as  possible  that  it  is 
approaching  these  limits  and/or  to  move  away  from  them 
quickly  and  effectively.  In  this  line  of  reasoning,  it  is 
obvious  that  the  equivalent  of  a  nervous  system  is 
mandatory  in  order  to  connect  the  animat's  perceptions  with 
its actions and that reflex circuits activated as quickly as
possible  increase  the  adaptive  nature  of  its  behavior.  It  is 
Abstract

The  thrust  of  this  paper  is 
to  meet  the  objections  of  what 
may be termed a philosophical
school,  whose  principals  are 
Robert  Rosen,  Howard  Pattee,  and 
Peter  Cariani.  The  objection 
that  a  computational  universe 
is  a  flat  "pseudo-world," 
because  it  is  "all  syntax  and 
no  semantics,"  is  inquired  into 
and  refuted,  as  is  the  claim 
that  nothing  really  new  can 
evolve  within  such  an  artificial 
universe.  It  is  concluded  that 
no  persuasive  reasons  have  been 
advanced  as  to  why  computational 
artificial  life  is  not  feasible. 

A  convergence  of  several 
fields  has  resulted  in  the  new 
discipline  of  Artificial  Life 
("AL")  research.  And  just  as  AL 
science  has  several  sources,  it 
is moving in several distinct
directions. Life-like entities
are being developed as biochemical
"wetware," robotic hardware,
and as computer software.
This  paper  shall,  however, 
deal  only  with  controversies 
surrounding  the  third  variety, 
namely  the  computational  AL 
form. 

Generally  stated,  the  AL 
program  is  to  develop  life-like 
organisms  in  the  medium  of 
choice.  For  myself  and  some 
other  AL  researchers,  the 
computer  is  our  medium  of 
choice.  Our  objective  is  to 
implant or evolve individuals
or  colonies  in  automaton 
universes,  to  observe  instances 
of  propagation,  adaptation, 
or  communication,  such  as  one 
usually  associates  with 
life  forms. 

Since  the  pioneering 
work of von Neumann (1966)¹,
cellular  automata  have  been 
much  used  as  computational 
media  for  AL  research.  In 
recent  years,  more  sophist¬ 
icated  systems,  for  example 
Abstract

The Really Useful Robots (RUR) project is seeking
to understand how robots can be built that
develop and maintain the task achieving competences
they require for flexible and robust behaviour
in variable and unforeseen situations, as
opposed to these being installed by their designers.
In this paper we present an experimental
autonomous robot with a map building competence
which uses a self-organising network. Map
building forms a necessary step on the way to
development of a navigational competence. Some
encouraging initial test results are also presented.


1  Introduction 

The traditional approach to control in (mobile) robots
is to decompose the task into separate components, and
implement these using standard control techniques, see
[Levi 1987], for example. This we call an analytical approach.
Alternatively, a control structure can be built
'bottom up', first building foundational competences
(such as ‘move around and avoid obstacles’), and later
on top of these more complicated competences (such as
‘explore’, ‘map building’, and ‘map using’). This we call
a synthetic approach, see [Brooks 1986], for example.

© U. Nehmzow and T. Smithers, May 1990

Names appear in alphabetical order, with both being principal
authors on this occasion.

At Edinburgh we have adopted a synthetic approach
in what we call the ‘Really Useful Robots’ project (RUR)
[Nehmzow et al 1989]. This project is attempting to develop
a control architecture which supports the development
of task achieving competences by the robot. In
other words, we are trying to understand how a robot
can sequentially acquire and maintain the behavioural
competences it requires, rather than have them ‘installed’
by us as its designers. We believe that this
autonomous acquisition of task achieving competences
will lead to greater flexibility and robustness in the behaviour
of robots with respect to variable and unforeseen
situations. In investigating this idea we are motivated
and informed by the adaptive control mechanisms we
see in simple animals which result in them having flexible,
reliable, and robust competences well matched to
the tasks they are responsible for achieving and to the
environment in which they are exercised.

Trying  to  get  a  robot  to  acquire  the  skills  it  needs 
means  that  as  many  decisions  as  possible  are  left  to 
the  robot,  rather  than  being  predefined  by  the  designer. 
Alder, the first of the ‘Really Useful Robots’ (see figure 6)*,
is able to adapt to a changing environment,
and  to  acquire  useful  competences.  It  uses  what  we  call 
fixed  and  plastic  components  in  its  control  architecture 

* Alder is a mobile robot whose base is built with a Fischertechnik
kit. It is about 35cm long, has an 8053 based microcomputer
on board (16k RAM) and is equipped with up to seven tactile
sensors plus odometer. In addition to this a sonar sensor is available,
but has not been used to obtain the results presented in
this paper. More information about Alder and the ‘Really Useful
Robots’ approach can be found in [Nehmzow et al 1989].
ABSTRACT.

This paper investigates the evolutionary
development of (problem solving) behavior.
Through evolution, artificial animals learn
to survive in a given world.

We use layered neural networks (NNs) as the
substrate on which evolutionary learning
operates. The fault tolerance of neural
networks allows for a genotype / phenotype
distinction which maintains the variation in
the genetic pool. Furthermore, we define
building blocks which take into account the
functionality of the NNs.

The result of the algorithm can be inspected
at two levels. First, there is the behavior of
the individual animals. A description of their
behavior is obtained through the induction of
decision trees which describe the functionality
of the NN. Second, the behavior of the
population as a whole can be described. The
distribution of the animals over the world
often provides an analogical representation of
a problem solution.


Keywords: autonomous agents, evolutionary
learning, genetic algorithms, inductive
learning, machine learning, neural networks.

1. Introduction.

Natural evolution continues to intrigue mankind.
Particularly, the complexity to which it leads
often surprises us. Examples of this complexity
are abundant. Animals in a prey-predator relation,
for example, develop complex defensive
and offensive behavior, such as camouflage and
imitation. Insect colonies (e.g. colonies of ants,
bees, wasps, termites etc.) are another instance
of  the  complexity  to  which  evolution  can  give 
rise.  Such  complexity  is  obvious  if  one  examines 
the  nests  termites  build  or  the  social  organisation 
within  insect  colonies  [Wilson  85].  Many  other 
examples  can  be  found  in  [Dawkins  86]  or 
[Tinbergen  65],  amongst  others. 

Computational  methodologies  based  on  the 
evolutionary  metaphor  have  been  developed  for  a 
wide range of problems, such as search, optimization
and machine learning. Among the most notable of
these methodologies are genetic algorithms (a
good overview of GAs can be found in [Goldberg
89]). Another, related, methodology is evolutionary
learning, which searches through a space
of  behaviors  using  the  principles  of  variation  and 
selection.  In  contrast  with  GAs,  evolutionary 
learning  does  not  require  an  explicit,  domain 
specific  fitness  measure.  We  only  need  to  specify 
the  characteristics  of  the  environment  (e.g.  the 
amount of food present) and the effects of an animal's
actions on itself and on the environment.
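
The contrast with an explicit fitness measure can be illustrated with a minimal Python sketch. It is not the system described in this paper: the one-number genotype, the energy bookkeeping and all parameter values are assumptions made only for illustration. Only the environment (food supply) and the effects of actions (energy cost and gain) are specified; no animal is ever scored or ranked.

```python
import random


def evolve(generations=30, pop_size=40, food_per_step=20, lifetime=10, seed=0):
    """Evolve a population by survival alone, with no fitness function."""
    rng = random.Random(seed)
    # Hypothetical genotype: one number in [0, 1], the probability that
    # the animal tries to eat on a given time step.
    population = [rng.random() for _ in range(pop_size)]
    for _ in range(generations):
        energy = [3.0] * pop_size
        for _ in range(lifetime):
            food = food_per_step              # environment: food supply per step
            order = list(range(pop_size))
            rng.shuffle(order)
            for i in order:
                if energy[i] <= 0:
                    continue                  # dead animals do not act
                energy[i] -= 1.0              # cost of living
                if food > 0 and rng.random() < population[i]:
                    food -= 1                 # effect on the environment
                    energy[i] += 2.0          # effect on the animal itself
        survivors = [g for g, e in zip(population, energy) if e > 0]
        if not survivors:                     # extinction: restart at random
            survivors = [rng.random() for _ in range(pop_size)]
        # Survivors reproduce with small mutations; nothing is ranked or scored.
        population = [
            min(1.0, max(0.0, rng.choice(survivors) + rng.gauss(0.0, 0.05)))
            for _ in range(pop_size)
        ]
    return population
```

Calling `evolve(seed=1)` returns the genes of the final generation. Selection pressure comes only from which animals are still alive at the end of their lifetime.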

In  this  paper  we  propose  an  evolutionary 
framework  in  which  successive  generations  of 
animals  learn  to  improve  their  chance  of  survival 
in  a  given  environment.  Or,  in  other  words, 
successive  generations  adaptively  develop  be¬ 
havior  (such  as:  look  for  food,  avoid  predators  ...) 
in  correspondence  with  their  natural  needs.  In 
order  to  be  successful,  an  animal  has  to  find  an 
answer  to  the  following  question:  When  should  I 
perform  which  actions?  This  question  is  an¬ 
swered  through  evolutionary  learning  on  NNs. 
Our system learns only through evolution; no
life-time learning mechanism is incorporated.
We should stress that our primary goal is to obtain
adequate (problem solving) behavior. The
Abstract

Animat  research  has  already  produced 
interesting  concepts  and  algorithms.  In  this 
paper,  we  analyze  how  this  research  can  be 
applied to understanding human intelligence
and to reproducing some expert behaviors.
To  support  these  ideas,  we  experiment  with  an 
improvement  of  Boole,  a  Genetic  Based 
Learning  algorithm  from  animat  research,  in  a 
medical  domain  of  expertise.  We 
experimentally  demonstrate  that  our  system 
obtains  good  results  on  a  well  known  realistic 
medical  diagnosis  task,  and  we  analyze  its 
potential  ability  to  solve  more  complicated 
problems. 


Introduction 

There  has  been  much  debate  about  how  one  can 
consider  that  a  system  is  intelligent,  most  of  the  time 
according  to  how  it  processes  information  (rules, 
neurons  etc.)  in  connection  with  the  human  brain. 
However,  in  [Wilson,  85]  Wilson  developed  the  idea 
that  we  could  probably  learn  more  from  ethology,  and 
he  introduced  the  concept  of  animats,  which  are 
autonomous  systems  which  learn  how  to  survive  and 
expand in a given environment.
We  propose  to  discuss  how  this  research  can  be 
profitable  to  the  understanding  of  human  intelligence, 
and how animat algorithms can be used to reproduce
some  intelligent  human  behavior. 


1.  Intelligence  hierarchy 

Let  us  consider  the  following  intelligent  systems 
hierarchy  based  on  how  explicit  the  input  knowledge 
from the environment must be: systems that learn by
being told, systems that learn by complete examples,
and  animats,  i.e.  systems  that  learn  by  reward. 


1st level: Systems that learn by being told

Most computers get their knowledge by being given
programs, i.e. a list of instructions to be executed in a
specified order; the processing is completely explicit
in the input knowledge.

Production  systems  without  learning  ability  get  their 
knowledge  from  rules  which  are  used  to  reason  about 
the  input  data  and  conclude  about  the  output  data  to 
provide. They are somewhat more intelligent because
the order in which rules are executed depends on the
data: control is data driven. This means that a few
rules  implicitly  specify  many  different  reasoning 
traces. 
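
Data-driven control can be made concrete with a small forward-chaining sketch in Python. The rule format and the toy medical rules below are hypothetical and serve only as an illustration; they are not taken from the system discussed in this paper. The same four rules produce different reasoning traces depending on the input facts.

```python
def forward_chain(rules, facts):
    """Fire rules until no new fact can be derived.

    rules: list of (set_of_premises, conclusion) pairs.
    facts: iterable of initial facts.
    Returns (final fact set, list of conclusions in firing order).
    """
    known = set(facts)
    trace = []
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires only when the data currently known satisfy it.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                trace.append(conclusion)
                changed = True
    return known, trace


# Hypothetical toy rule base.
RULES = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected"}, "recommend_rest"),
    ({"rash"}, "allergy_suspected"),
    ({"allergy_suspected"}, "recommend_antihistamine"),
]
```

With `{"fever", "cough"}` as input only the first two rules fire; with `{"rash"}` only the last two do. The rule base itself never states an execution order.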

Specifically  built  neural  networks  get  their  knowledge 
from  a  set  of  predetermined  weights  which  indicate 
how a formulated hypothesis (micro-feature represented
by one neuron) influences another. However, without
learning, they are not really much more intelligent
than production systems, but some experiments tend
to show that they are less brittle and noise sensitive
and can exhibit even richer behavior than can be
expected because they use parallel analog processing.


2nd level: Systems that learn by complete examples

Learning  systems  can  manage  with  even  less 
formalized  knowledge:  they  only  need  examples  which 
contain  both  the  input  and  the  corresponding  desired 
output;  they  take  care  of  extracting  the  appropriate 
knowledge  that  is  needed  to  generalize  the  sampled 
behavior  to  new  inputs.  Such  systems  can  be  either 
rule  based  systems,  connectionist  networks,  or 
classifier  systems.  A  number  of  learning  algorithms 
Abstract

We  consider  the  problem  of  a  robot  with 
uninterpreted  sensors  and  effectors  which  must 
learn,  in  an  unknown  environment,  behaviors 
(i.e.,  sequences  of  actions)  which  can  be  taken 
to  achieve  a  given  goal.  This  general  prob¬ 
lem  involves  a  learning  agent  interacting  with 
a  reactive  environment:  the  agent  produces  ac¬ 
tions  that  affect  the  environment  and  in  turn 
receives  sensory  feedback  from  the  environ¬ 
ment.  The  agent  must  learn,  through  experi¬ 
mentation,  behaviors  that  consistently  achieve 
the  goal.  The  difficulty  lies  in  the  fact  that  the 
robot  does  not  know  a  priori  what  its  sensors 
mean,  nor  what  effects  its  motor  apparatus  has 
on  the  world. 

We  propose  a  method  by  which  the  robot 
may  analyze  its  sensory  information  in  order 
to  derive  (when  possible)  a  function  defined 
in  terms  of  the  sensory  data  which  is  maxi¬ 
mized  at  the  goal  and  which  is  suitable  for  hill¬ 
climbing.  Given  this  function,  the  robot  solves 
its  problem  by  learning  a  behavior  that  maxi¬ 
mizes  the  function  thereby  resulting  in  motion 
to  the  goal. 

1  The  credit  assignment  problem 

The  learning  problem  addressed  in  this  paper  is  illus¬ 
trated  in  Figure  1.  The  learning  agent,  which  we  are 
calling a “critter,” receives sensory input (vector s) from
the  world  and  acts  on  the  world  via  motor  outputs  (rep¬ 
resented  by  a,  the  action  vector).  In  addition,  the  critter 
has  access  to  a  reward  signal,  r,  by  which  it  knows  when 
it  has  achieved  its  goal.  (In  the  experiments  discussed 
later,  the  reward  signal  is  incorporated  into  the  sense 
vector  for  simplicity.)  The  critter’s  task  is  to  learn  a  be¬ 
havior  which  reliably  achieves  the  goal.  This  behavior 
is  a  sequence  of  actions  (most  likely  dependent  on  the 
concomitant sequence of sense vectors) which takes the
critter from its present state to the goal state. The problem
is difficult because the reward signal does not provide
feedback for every action. The critter only knows
that it has done the right thing when it stumbles onto
the goal and is rewarded. It is then faced with the credit
assignment problem: the problem of deciding which actions
led to the goal.

Figure 1: The general problem is for the learning agent,
the “critter,” to learn sequences of actions which produce
rewards. The critter is rewarded when it is in a goal state.

2  A  solution 

In  this  paper,  we  propose  the  following  solution  to  this 
problem: 

1.  Derive  a  function  defined  in  terms  of  the  sense  vec¬ 
tor  (which  is  itself  a  function  of  the  state  of  the 
world)  such  that  this  function  is  maximized  at  the 
goal  state  and  is  suitable  for  hill-climbing.  It  may 
in some cases be impossible to find such a function,
in which case the method fails.

2.  Learn  a  behavior  that  does  gradient  ascent  on  this 
hill-climbing  function. 
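
The second step can be sketched in Python as a greedy ascent. This is a simplified illustration under strong assumptions: the function `h` is taken as given, and the effect of each action (`step`) is assumed to be known in advance, whereas the critter in this paper must learn both. All names are hypothetical.

```python
def hill_climb(state, actions, step, h, max_steps=100):
    """Greedy ascent on h.

    At each step, take the action whose successor state has the highest
    value of h; stop when no action improves on the current state.
    Returns the list of visited states.
    """
    path = [state]
    for _ in range(max_steps):
        best = max((step(state, a) for a in actions), key=h)
        if h(best) <= h(state):
            break                      # local maximum (the goal, if h is suitable)
        state = best
        path.append(state)
    return path


# Hypothetical grid world: h is the negated city-block distance to the goal.
GOAL = (3, 2)
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def move(state, action):
    return (state[0] + action[0], state[1] + action[1])


def neg_distance(state):
    return -(abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1]))
```

Here `hill_climb((0, 0), MOVES, move, neg_distance)` ends at `GOAL`, because the chosen `h` has no local maxima other than the goal. A function with other local maxima would not be suitable for hill-climbing in the sense of step 1.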

The  problem  explored  in  this  paper  can  be  viewed  as 
the  problem  of  learning  a  hill-climbing  function  to  re¬ 
place  an  a  priori  function  where  the  latter  is  not  appro¬ 
priate  for  gradient  ascent.  In  general,  this  problem  can 
be  described  as  follows:  There  is  some  function,  given  a 
Abstract

Experimentation  with  animal  simulations  is 
limited, in large part, by the difficulty of
converting  ethological  ideas  into  computer 
programs.  *Logo  is  a  new  programming  language 
that  aims  to  make  it  easier  for  non-expert 
programmers  (researchers  as  well  as  students) 
to  develop  and  modify  their  own  simulations. 
*Logo  is  designed  specially  for  simulating 
colony-level  behaviors — that  is,  group 
behaviors that emerge from interactions among
hundreds  or  thousands  of  individual  creatures. 
Unlike  most  simulation  languages,  *Logo  gives 
the  creatures'  environment  an  equal 
computational  status  to  the  creatures 
themselves.  Users  write  rules  for  creatures  and 
for  "patches"  of  the  environment,  then  observe 
the  higher-level  behaviors  that  result.  A 
sample  *Logo  program  shows  how  local, 
parallel actions among ants can lead to spatially-
extended and temporally-sequential patterns in
the colony-level behavior.


1.  Introduction 

During  the  past  several  years,  a  growing  number  of 
researchers  have  begun  creating  computer-based 
simulations  of  animal  behavior.  Some  are  motivated 
by  ethological  goals:  they  hope  to  gain  a  better 
understanding  of  the  mechanisms  underlying  the 
behaviors  of  real  animals.  Others  are  motivated  by 
engineering  goals:  they  hope  that  simulations  of 
animals will provide ideas (or at least inspiration)
for  building  computers  and  robots  that  function  more 
effectively  in  the  world. 

Unfortunately,  designing  and  programming  animal 
simulations  typically  requires  significant 
programming  expertise.  Most  animal  simulations  are 
developed  as  customized  programs,  by  experienced 
programmers.  Although  there  are  several  new 


languages  and  tools  designed  to  facilitate  the 
development  of  animal  simulations,  even  these  tools 
are  meant  primarily  for  experienced  programmers. 

As  a  result,  it  is  difficult  for  non-experienced 
programmers  to  convert  ethological  ideas  into 
computer  simulations.  Certainly,  novice  programmers 
can  change  parameters  or  initial  conditions  on 
existing  simulations.  But  they  are  not  able  to  make 
more serious modifications or create entirely new
simulations.  In  short,  animal  simulations  are  still  not 
for  the  masses. 

What  is  needed  is  a  new  type  of  programming 
language  that  allows  people  to  more  easily  create 
and  experiment  with  animal  simulations.  This  paper 
describes a language that aims to do just that. The
language, called *Logo (pronounced star-logo), is
designed  especially  for  simulating  "colony-level 
behaviors"— that  is,  group  behaviors  that  emerge  as 
large  numbers  of  individual  animals  interact  with 
one  another,  as  in  bird  flocking  or  ant  foraging.  These 
types  of  simulations  are  particularly  difficult  to 
construct using traditional, sequential programming
languages.  Indeed,  simulations  of  colony-level 
behaviors  highlight  the  need  for  a  new  "massively 
parallel"  approach  to  programming,  in  which  many 
"computational  creatures"  act  in  parallel  (at  least 
conceptually,  if  not  in  reality). 
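
The idea of creatures and environment acting "in parallel, at least conceptually" can be sketched in ordinary sequential Python. This is not *Logo code and does not reproduce its syntax; the function name, the pheromone grid and the parameter values are assumptions chosen for illustration. Every creature applies the same local rule, then every patch applies its own rule.

```python
import random


def tick(ants, pheromone, size, rng, deposit=1.0, evaporation=0.1):
    """Advance the world one step: every ant acts, then every patch acts."""
    # Phase 1: each creature follows the same local rule
    # (random step on a wrap-around grid, then drop pheromone).
    new_ants = []
    for x, y in ants:
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        nx, ny = (x + dx) % size, (y + dy) % size
        pheromone[ny][nx] += deposit
        new_ants.append((nx, ny))
    # Phase 2: each patch follows its own rule (evaporation).
    for row in pheromone:
        for i in range(len(row)):
            row[i] *= 1.0 - evaporation
    return new_ants
```

A run consists of repeated calls to `tick`. The loop order inside each phase is an implementation detail: each ant and each patch uses only local information, which is what lets the program be read as if all of them acted at once.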

Section  2  describes  the  audience  for  *Logo. 
Although  *Logo  is  designed  primarily  to  help 
students  explore  self-organizing  phenomena,  it  could 
serve  equally  well  as  a  tool  for  ethologists.  Section  3 
discusses  the  computational  requirements  for 
programming  colony-level  animal  simulations. 
Section  4  discusses  the  central  ideas  underlying  the 
design  of  *Logo,  including  the  decision  to  treat  the 
creatures'  "world"  as  an  active  computational  actor, 
equal  in  status  to  the  creatures  themselves.  Section  5 
presents  examples  of  *Logo  simulations.  A  simulation 
of  ant  foraging,  for  example,  shows  how  local, 
parallel  actions  by  hundreds  of  individual  ants  can 
result  in  spatially-extended  and  temporally- 
sequential  behaviors  by  the  colony  as  a  whole. 
Section  6  describes  future  directions  for  *Logo. 
Abstract

Classifier  systems  (CSs)  have  been  used  to 
simulate  and  describe  the  behavior  of  adap¬ 
tive  organisms,  animats,  and  robots.  How¬ 
ever,  classifier  system  implementations  to  date 
have  all  been  reactive  systems,  which  use  sim¬ 
ple  S-R  rules  and  which  base  their  learning  al¬ 
gorithms  on  trial-and-error  reinforcement  tech¬ 
niques  similar  to  the  Hullian  Law  of  Effect. 
While  these  systems  have  exhibited  interesting 
behavior  and  good  adaptive  capacity,  they  can¬ 
not  do  other  types  of  learning  which  require 
having  explicit  internal  models  of  the  external 
world,  e.g.,  using  complex  plans  as  humans  do, 
or  doing  “latent  learning”  of  the  type  observed 
in  rats.  This  paper  describes  a  classifier  system 
that  is  able  to  learn  and  use  internal  models 
both  to  greatly  decrease  the  time  to  learn  gen¬ 
eral  sequential  decision  tasks  and  to  enable  the 
system  to  exhibit  latent  learning. 

1  Introduction 

Classifier  systems  (CSs)  have  been  used  to  un¬ 
derstand,  through  metaphor  and  simulation,  the 
behavior  of  adaptive  organisms  and  robots  from 
animats  ([Holland  and  Reitman,  1978],  [Booker,  1982], 
[Wilson, 1985]) to rabbits [Holyoak et al., 1990] to humans
[Holland et al., 1986]. However, all CSs implemented
to date have been reactive systems, i.e., they
store and use knowledge as rules of the form “If the situation
is X, do A” (where X may describe a set of world
states).  In  the  terms  of  animal  psychology,  the  system 
stores  S-R  associations  [Walker,  1987].  These  systems  all 
have used a trial and error learning technique, the bucket
brigade algorithm (BBA),¹ to assign priorities to rules;
the priorities determine which rules will fire and what the
system will do in a given situation. The BBA is reminiscent
of the Hullian “Law of Effect,” i.e., rules active
when reward is received (from the environment or from

¹The bucket brigade algorithm is a temporal difference method
[Sutton, 1988].


subsequently  active  rules)  have  their  priorities  modified 
in  proportion  to  the  reward.  (They  use  a  more  sophisti¬ 
cated  learning  algorithm,  the  genetic  algorithm,  to  form 
generalizations  over  the  space  of  situations,  i.e.,  to  form 
concepts.)  These  CSs  use  only  a  very  simple  model  of 
the  world,  in  which  a  rule’s  priority  in  effect  predicts 
the  reward  expected  if  that  rule  is  fired.  Despite  us¬ 
ing  such  simple  representational  and  learning  techniques, 
CSs  have  shown  surprisingly  interesting  behavior  when 
controlling animats which must learn and adapt to survive
in simple environments; they have also been used
to  solve  concept  learning  [Wilson,  1987a],  dynamic  con¬ 
trol  [Goldberg,  1988],  and  sequential  decision  problems 
([Grefenstette,  1988],  [Booker,  1989]). 
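
How rule priorities can come to predict reward may be illustrated with a deliberately simplified Python sketch of a bucket-brigade-style update. It is not the exact algorithm used by any of the systems cited here: the single `bid_ratio` parameter, the dictionary of strengths and the absence of taxes and matching are all simplifying assumptions.

```python
def bucket_brigade_step(strength, prev_rule, rule, reward, bid_ratio=0.1):
    """One simplified bucket-brigade update.

    The firing rule pays a bid (a fraction of its strength) to the rule
    that fired just before it, then collects any external reward.
    """
    bid = bid_ratio * strength[rule]
    strength[rule] -= bid
    if prev_rule is not None:
        strength[prev_rule] += bid     # payment passed back along the chain
    strength[rule] += reward           # external reward, if any


def run_episode(strength, chain, final_reward):
    """Fire the rules in `chain` in order; only the last one is rewarded."""
    prev = None
    for k, rule in enumerate(chain):
        reward = final_reward if k == len(chain) - 1 else 0.0
        bucket_brigade_step(strength, prev, rule, reward)
        prev = rule
```

Starting from `{"A": 10.0, "B": 10.0}` and repeating `run_episode(strength, ["A", "B"], 10.0)`, the strength of the rewarded rule B rises first, and the strength of A, which only sets the stage for B, follows through the payments it receives from B.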

A  second  type  of  learning  involves  building  more  com¬ 
plex  models  which  not  only  predict  rewards,  but  also 
predict world states.² These models can be implemented
as  rules  of  the  form  “If  the  situation  is  X,  and  I  do  A, 
then  expect  situation  Y,”  i.e.,  S-R-S  associations.  Sys¬ 
tems  can  create  and  update  these  models  continuously, 
even  when  no  rewards  are  being  received,  by  predicting 
the (non-reward) outcomes of actions and then modifying
the model when the predictions are incorrect. That
is,  rather  than  just  using  the  usually  infrequent  feed¬ 
back  provided  by  rewards  or  punishments  to  build  sim¬ 
ple  S-R  models,  systems  can  exploit  the  flood  of  non¬ 
reward  experiences  they  have  to  build  much  more  com¬ 
plete  models  of  the  world.  Predictions  of  expected  states 
then  can  be  integrated  with  motivations  and  predic¬ 
tions  of  rewards  to  choose  actions  that  lead  to  goals. 
Using  internal  models  enables  systems  to  reduce  the 
number of trials required to learn tasks ([Sutton, 1990],
[Whitehead and Ballard, 1989]). Further, a model enables
a system to easily integrate newly acquired knowledge
about the world or about changes in goals or motivations
([Holland et al., 1986], [Dickinson, 1980]). Internal
models also have been used to simulate Piagetian
cognitive  development  during  infancy  ([Drescher,  1986], 
[Drescher,  1989]). 
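
A minimal Python sketch of S-R-S rules and their use follows. It is an illustration only, not the classifier system described in this paper: the model is a plain dictionary, the world is assumed deterministic, and the breadth-first lookahead and all names are assumptions. The model is built from unrewarded experience and used to reach a goal only once one is given, which is the essence of latent learning.

```python
from collections import deque


def learn_model(experiences):
    """Build S-R-S rules from (state, action, next_state) triples.

    No reward is needed; a later observation overwrites an earlier,
    incorrect prediction for the same (state, action) pair.
    """
    model = {}
    for state, action, next_state in experiences:
        model[(state, action)] = next_state
    return model


def plan(model, start, goal):
    """Breadth-first lookahead through the learned model.

    Returns a list of actions leading from start to goal, or None.
    """
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for (s, a), s_next in model.items():
            if s == state and s_next not in seen:
                seen.add(s_next)
                frontier.append((s_next, actions + [a]))
    return None
```

For example, after wandering through a small hypothetical maze with no reward present, the system can be told that the goal is `"food_box"` and find a route with `plan(model, "start", "food_box")` without further trial and error.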


²Unless stated otherwise, “internal model” and “model” will refer
to this more complex type of model.


Cognitive  Action  Theory  as  a  Control  Architecture 


H.  L.  Roitblat 

Department  of  Psychology 
University  of  Hawaii  at  Manoa 
Honolulu,  Hawaii,  USA  96822 
herbert@uhccux.uhcc.hawaii.edu 


The three laws of robotics: (1) A robot may not harm a human being or, through inaction,
allow a human being to come to harm. (2) A robot must obey the orders given it by
human beings except where such orders would conflict with the First Law. (3) A robot
must protect its own existence as long as such protection does not conflict with the First
or Second laws. Asimov, 1950/1977, p. 40.


Abstract 

Standard  versions  of  control  theory  approach  their 
limits in autonomous robotic control because of
their conceptualization in terms of a fixed mapping
between environmental variables and behavior.
Cognitive action theory, in contrast, views behavior
as  hierarchically  organized  by  a  network  of 
interacting  nodes.  Nodes  at  different  levels 
represent  action  with  different  degrees  of 
abstraction.  Activation  of  nodes  in  the  hierarchy  is 
controlled  by  potentiation  and  inhibition  received 
from  other  nodes  and  by  environmental  stimulus 
information.  Learning  consists  of  the  formation  of 
connections  between  nodes  as  the  result  of  locally 
available  information  about  the  satisfaction  of 
cybernetic  feedback  functions.  Such  networks  are 
capable  of  planning  and  executing  highly  flexible 
behaviors  in  complex,  uncertain  environments. 

1.  Introduction 

Among  the  difficult  problems  for  designers  of 
autonomous  robot  systems  are  control  of  (a)  nonlinear 


systems,  (b)  stochastic  systems,  (c)  systems  requiring 
sensor  fusion  and  feedback,  and  (d)  systems  with 
incomplete information about the environment and its
structure (Meystel, 1988). Many of these problems are
exacerbated by the control-theory-dominated approach
that robotic designers have taken. Although control
theory has been remarkably successful in many domains,
it approaches its limits in autonomous robotic control,
because in its standard versions it assumes that behavior
can be adequately conceptualized as a system with a fixed
mapping between environmental variables and behaviors.
It typically characterizes the behavior of the robot in
terms of movements or operations that are precisely
defined as specific responses to specific environmental
conditions  (Brooks,  1986).  This  approach  works  well  in 
situations  in  which  variability  is  limited,  goals  are  simple 
and  noncompetitive,  uncertainty  is  minimal  (or  at  least 
statistically  characterizable),  and  the  environment  and 
actions  appropriate  to  it  are  fairly  exhaustively  known 
(e.g.,  a  factory  environment).  The  approach  is  likely  to 
prove  ultimately  inadequate,  however,  in  highly  variable 
environments, in the face of complex competing demands,
and  in  situations  in  which  no  complete  set  of  situation- 
response  rules  is  available  a  priori. 
Abstract

Analysis  of  animal  performance  can  provide 
important  cues  about  the  design  of  automated 
artificial  biomimetic  systems.  On  the  basis  of 
behavioral  observations,  we  have  been  developing 
models  of  dolphin  echolocation  ability  that  have 
applicability  to  the  design  of  biomimetic  sonar 
systems.  A  dolphin  was  trained  to  perform  an 
echolocation  delayed  matching-to-sample  task. 

The  clicks  the  animal  generated  during  task 
performance  were  recorded  and  digitized  along 
with  the  echoes  returned  by  the  stimulus  objects. 

The dolphin's performance was then modeled
using artificial neural networks.

1.  Biomimetics 

The study of animals can provide a very important
adjunct to formal analyses in the design of automated
systems such as robots and autonomous vehicles.
Animals  have  evolved  in  a  real  world,  solving  real 
problems,  such  as  gathering  and  interpreting  essential 
information.  Evolution  supports  the  emergence  of 
solutions  that  are  well  adapted  to  the  animal's  ecological 
niche,  but  provides  no  guarantee  that  the  evolutionary 
solutions  an  animal  derives  are  the  best  possible  solution 
to a given problem (see Gould & Lewontin, 1979).
Evolution  merely  asserts  that,  in  light  of  the  competing 
demands  presented  by  the  animal’s  evolutionary  history, 
its  ecology,  and  its  other  needs,  a  solution  (vis  a  vis  the 
whole  organism  and  all  its  adaptations  and  constraints)  is 


better,  or  at  least  no  worse,  than  any  other  that  its 
ancestors  had  achieved  (Roitblat,  1987). 

Although  formal  analyses  have  undoubtedly  been 
successful in developing solutions to many automatic
process problems such as those encountered in designing
robots, many other problems have resisted solution.
Solutions to scientific and engineering problems are
inspired by many sources, but are ultimately derived from
the intuition of the engineer, as formalizations of folk
physics, folk psychophysics, folk psychology, etc. Folk
science  is  the  set  of  generally  held  beliefs  that  people 
employ  in  their  ordinary  activities.  For  example,  many 
college  students  recognize  that  a  ball  rolled  out  of  an 
inclined tube will fall some distance in front of the tube.
Most of these same students, however, often mistakenly
believe that if they drop a ball while walking, the ball will
fall directly under the point at which it was released
(McCloskey, 1983; see also Holland, Holyoak, Nisbett, &
Thagard,  1986). 

Systems involving falling balls are well analyzed so
anyone with training in mechanics can see clearly the
difference between the folk beliefs concerning falling balls
and formal scientific beliefs (we may call these “facts”).
Our intuitions have been trained to correspond with the
analyzed facts, rather than with unanalyzed apparent
perceptions.  In  situations  in  which  formal  scientific 
analyses  have  not  yet  been  fully  applied,  we  have  no 
assurance  that  our  scientific  intuitions  similarly  avoid  the 
pitfalls  of  our  naivete.  A  creative  scientist  or  engineer 
will  apply  his  or  her  folk-science  intuition  to  a  difficult 
Abstract

This  paper  introduces  a  framework  for  ‘curious  neural 
controllers’  which  employ  an  adaptive  world  model  for 
goal  directed  on-line  learning. 

First  an  on-line  reinforcement  learning  algorithm  for 
autonomous  ‘animats’  is  described.  The  algorithm  is 
based  on  two  fully  recurrent  ‘self-supervised’  continually 
running  networks  which  learn  in  parallel.  One  of  the  net¬ 
works  learns  to  represent  a  complete  model  of  the  envi¬ 
ronmental  dynamics  and  is  called  the  ‘model  network’.  It 
provides  complete  ‘credit  assignment  paths’  into  the  past 
for the second network which controls the animat’s physical
actions in a possibly reactive environment. The animat’s
goal is to maximize cumulative reinforcement and
minimize cumulative ‘pain’.

The algorithm has properties which make it possible to implement
something like the desire to improve the model network’s
knowledge about the world. This is related to curiosity.
It is described how the particular algorithm (as well
as similar model-building algorithms) may be augmented
by  dynamic  curiosity  and  boredom  in  a  natural  manner. 
This  may  be  done  by  introducing  (delayed)  reinforcement 
for  actions  that  increase  the  model  network’s  knowledge 
about  the  world.  This  in  turn  requires  the  model  network 
to  model  its  own  ignorance,  thus  showing  a  rudimentary 
form  of  self-introspective  behavior. 

1.  Introduction 

In  the  sequel  first  an  on-line  algorithm  for  reinforcement 
learning  in  non-stationary  reactive  environments  is  de¬ 
scribed.  The  algorithm  heavily  relies  on  an  adaptive 
model  of  the  environmental  dynamics.  The  main  contri¬ 
bution  of  this  paper  (see  the  second  section)  is  to  demon¬ 
strate  how  the  algorithm  may  be  naturally  augmented 
by  curiosity  and  boredom,  in  order  to  improve  the  world 
model  in  an  on-line  manner. 

Consider  an  ‘animat’  whose  movements  are  controlled 
by  the  output  units  of  a  neural  network,  called  the  control 

*This work was supported by a scholarship from SIEMENS AG


network,  which  also  receives  the  animat’s  sensory  percep¬ 
tion  by  means  of  its  input  units.  The  animat  potentially 
is  able  to  produce  actions  that  may  change  the  environ¬ 
mental  input  (external  feedback  caused  by  the  ‘reactive’ 
environment).  By  means  of  recurrent  connections  in  the 
network  the  animat  is  also  potentially  able  to  internally 
represent  past  events  (internal  feedback). 

The  animat  sometimes  experiences  different  types  of 
reinforcement by means of so-called reinforcement units
or pain units that become activated in moments of reinforcement
or ‘pain’ (e.g. the experience of bumping
against an obstacle with an extremity). The animat’s
only  goal  is  to  minimize  cumulative  pain  and  maximize 
cumulative  reinforcement.  The  animat  is  autonomous  in 
the  sense  that  no  intelligent  external  teacher  is  required 
to  provide  additional  goals  or  subgoals  for  it. 

Reinforcement units and pain units are similar to other
input  units  in  the  sense  that  they  possess  conventional 
outgoing  connections  to  other  units.  However,  unlike  nor¬ 
mal  input  units  they  can  have  desired  activation  values  at 
every  time.  For  the  purpose  of  this  paper  we  say  that  the 
desired activation of a pain unit is zero for all times; other
reinforcement  units  may  have  positive  desired  values.  In 
the  sequel  we  assume  a  discrete  time  environment  with 
‘time ticks’. At a given time the quantity to be minimized
by the learning algorithm is $\sum_{t,i} (c_i - y_i(t))^2$, where $y_i(t)$
is the activation of the $i$th pain or reinforcement unit at
time $t$, $t$ ranges over all remaining time ticks still to come,
and $c_i$ is the desired activation of the $i$th reinforcement
or pain unit for all times.
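
As a small worked illustration, the quantity to be minimized can be computed in Python as follows, assuming it is the sum, over the remaining time ticks and over all pain and reinforcement units, of the squared difference between desired and actual activation. The function name and the list-of-lists data layout are hypothetical.

```python
def cumulative_error(activations, desired):
    """Sum over time ticks t and units i of (c_i - y_i(t)) ** 2.

    activations: list of per-tick lists, activations[t][i] = y_i(t).
    desired: list of desired activations c_i (zero for pain units).
    """
    return sum(
        (c - y) ** 2
        for tick in activations
        for c, y in zip(desired, tick)
    )
```

With one pain unit (desired value 0) and one reinforcement unit (desired value 1), a tick with activations `[0.0, 1.0]` contributes nothing, while a tick with activations `[1.0, 0.0]` contributes 2.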

The  reinforcement  learning  animat  faces  a  very  gen¬ 
eral  spatio-temporal  credit  assignment  task:  No  external 
teacher  provides  knowledge  about  e.g.  desired  outputs  or 
‘episode  boundaries’  (externally  defined  temporal  bound¬ 
aries  of  training  intervals).  In  the  sequel  it  is  demon¬ 
strated  how  the  animat  may  employ  a  combination  of 
two  recurrent  self-supervised  learning  networks  in  order 
to  satisfy  its  goal. 

Munro [2], Jordan [1], Werbos [12], Robinson and Fallside
[6], and Nguyen and Widrow [4] used ‘model networks’
for constructing a mapping from output actions
Abstract

In  this  paper  it  is  argued  that  studying  the  behaviour 
of  (even  simple)  animals  and  human  beings  from  an 
Ethologist’s  point  of  view  will  provide  a  basis  for 
understanding  human  cognition.  A  simple  model 
of the organisation of behavioural sequences in animals
as described by Niko Tinbergen [1] and Konrad
Lorenz [2] is presented as a starting point to develop
intelligent autonomous systems. The relation
to current research into behaviour-based Robotics
is  shown  (cf.  Brooks  [3])  and  essential  extensions 
to  the  behavioural  model  such  as  optimisation  pro¬ 
cedures  based  on  genetic  algorithms  and  evolution 
technology,  a  framework  to  link  basic  sensory-motor 
skills  to  higher-order  categorical  perception  as  de¬ 
scribed  by  the  symbol  grounding  problem  and  more 
advanced  models  of  the  organisation  of  behavbur 
are  presented.  In  particular,  self-organisational  pro¬ 
cesses  are  advocated  as  a  key  feature  in  achieving 
intelligent  behaviour  of  autonomous  robots. 

1  Introduction 

Initially, Robotics was the most attractive field
for researchers in Artificial Intelligence (AI) in order to
study the whole range of cognitive capabilities of human
beings. Defined as the intelligent connection between
sensing and acting, Robotics was supposed to naturally
pose the questions one has to answer in order to
understand human intelligence.

Researchers focussed on various aspects of the
sensing-to-acting chain using a knowledge-based approach to
specify the information necessary for the robot to perform
various tasks. In general, this information consists of a world
model and the knowledge about possible transitions from
one world state into another due to the actions of such
an intelligent agent.


Since a robot is usually located in an unstructured
environment, world modelling requires us to impose
structure on this environment using an explicit description
of the objects surrounding the robot. The problem of
which aspects of the environment should be modelled and
which aspects of the robot-world interactions should
remain constant or not became one of the most important
questions in this domain, well-known as the Frame
problem [4].

Another problem which arises when using a world model
is the precision of the robot’s internal representation of
the surrounding world. Since the control algorithms of
today’s robots are based on the transformation of
three-dimensional coordinates in order to perform the required
actions, the internal representation of the world which
serves to interpret sensor data and to choose the
appropriate actions (the representation of which in turn has to
be translated into sequences of three-dimensional
trajectories of the robot’s actuators) must be precise in order
to allow an exact mapping from the external world to
the internal representation and vice versa.

The data received from the sensors is used to update the
internal world model. But as it is difficult to gain a
precise three-dimensional description from currently
available sensors, it was claimed that the current sensor
technology was not sufficient to deliver precise updates of the
model.

Because of this lack of information, it is argued, robots
are not able to perform appropriate and flexible actions.
Once more precise sensors have been developed, the
problem of incorporating more precise world knowledge
into the internal representation could be solved more
easily.

We  do  not  agree  with  this  opinion  and  are  convinced  that 
the  inflexibility  and  subtlety  of  current  robot  controllers 
are  inherently  based  on  the  approach  chosen.  Making  a 
decision  in  favour  of  such  a  knowledge-based  approach  to 
world modelling naturally leads to the problems discov-

Abstract

Certain parallels are explored between the mechanism
of associative learning in animals - and its description by
contemporary learning theories such as that of Rescorla
and Wagner (1972) - and object classification, which can
be construed as the problem of learning to associate
visual features or micro-features with a category. A
specific associative learning theory of classification is
presented. While prototype effects can be easily
accommodated within associative theories of classification,
exemplar effects appear to be fatal for such accounts,
since no explicit representations of exemplars are stored
in associative networks. An experiment is reported which
attempted to see whether the model would be able to
reproduce exemplar effects. Surprisingly, it could. In
examining why this was the case, a new interpretation of
exemplar effects emerged.

1.  Introduction 

Some years ago, it was very fashionable to assume that
people represented categories of visual objects in terms of
their prototypes. As a result of learning that a large
number of different visual stimuli all belong to a certain
category - such as the category dog - people were
assumed to have extracted a prototype representing the
central tendency of the stimuli on a variety of feature
dimensions.

Figure 1 (left-hand panel) illustrates how this
abstraction is supposed to work. Imagine a number of exemplars
(marked by X’s) of a category which vary on, say, two
dimensions. For instance, the exemplars could be dogs
varying in colour and size. According to prototype
theories, what is actually extracted and mentally
represented of this category is the central tendency of the
exemplars within the feature space. The prototype
(marked by a dot) has a value on each dimension
corresponding to the modal value of the actual exemplars
on that dimension.

What happens when a new stimulus is presented for
classification? According to prototype theories, the
similarity of the test stimulus to the prototype is determined -
this being just the inverse of the distance between them -
and the stimulus is classified as being a member of this
category if its similarity to the dog prototype is greater
than its similarity to any other category prototypes. The
greater the similarity to the prototype, the faster, more
confident, or more accurate the classification decision.
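The classification rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the prototype is computed as the per-dimension mean (the text speaks of a central tendency and of modal values), similarity is taken as 1 / (1 + distance) so that it stays finite at zero distance, and the exemplar coordinates and the names `prototype`, `similarity` and `classify` are hypothetical.

```python
import math


def prototype(exemplars):
    """Central tendency of a category; the per-dimension mean is an assumption."""
    n = len(exemplars)
    dims = len(exemplars[0])
    return tuple(sum(e[d] for e in exemplars) / n for d in range(dims))


def similarity(a, b):
    """Similarity as an inverse function of Euclidean distance."""
    return 1.0 / (1.0 + math.dist(a, b))


def classify(stimulus, prototypes):
    """Return the category whose prototype is most similar to the stimulus."""
    return max(prototypes, key=lambda name: similarity(stimulus, prototypes[name]))


# Hypothetical exemplars varying on two dimensions (e.g. colour and size).
training = {
    "dog": [(2.0, 3.0), (3.0, 4.0), (4.0, 5.0)],
    "cat": [(8.0, 1.0), (9.0, 2.0), (10.0, 3.0)],
}
prototypes = {name: prototype(ex) for name, ex in training.items()}
label = classify((3.5, 3.5), prototypes)
```

Note that once the prototypes are computed, the training exemplars are never consulted again, which is exactly the commitment of prototype theories discussed next.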

The crucial commitment, then, of prototype theories is
that during the process of classification, no representations
of specific training exemplars play a role in classification.
It is only the abstracted prototype which determines the
course of classification. Of course, representations of
specific training exemplars clearly do exist: thus each of
us has memory traces of specific faces that we are
familiar with - our mother’s face, for example. Prototype
theories maintain, however, that when a new physical
stimulus appears before me and I decide that it is a face,
the trace that exists in memory of my mother’s face plays
no role in this process. It is only the prototypical face that
I have abstracted from all the faces that I have ever
experienced that determines my classification decision.

Now, there is one piece of empirical evidence which
appears to provide strong encouragement for the prototype
view, and this is the fact that subjects often respond more
accurately, or with greater confidence, to the prototype of
a category than they do to the specific training exemplars,
even though they may never have seen the prototype
before. A clear example of this occurs in an experiment
by Knapp and Anderson (1984). They generated training
stimuli, which were dot patterns, by distorting a particular
prototype pattern. Subjects saw a number of distortions of
the prototype, and learned correctly to classify all of these
patterns. When they were subsequently tested either with
new distortions, the original training patterns, or with the
prototype itself, they responded most accurately to the
prototype (I shall call this the prototype effect). This was
most evident when the number of different training
stimuli used had been large. Of course, this result is
precisely what one would expect on a prototype account,
since the prototype pattern corresponds to what the

Abstract

This paper gives details of recent experiments to
determine the characteristics of two simple
mediation strategies. We have chosen to compare
two mutual inhibition strategies: the standard
'flip-flop' model [Ludlow 76][Shackleford 89], and a
scheme suggested by McFarland [McFarland 65].

To study the comparative features of each
network we use TAG's 'non-spiking' neuron
model (SNF block model [Snaith 89a][Snaith
89b][Holland & Snaith 90a]) both in simulation
and on one of our Hilda series of mobile robots
to show the mediation between simple 'fight' and
'flight' behaviours set up in conflict.

1. Introduction

A number of problems in animal behaviour centre
around the situation where there are a number of
stimuli present and a number of appropriate
responses possible, only one of which can take place at any
time. Which response (or behaviour) will occur? And
when, and under what circumstances, will it be
supplanted by another? In behavioural analysis
variations in response have been attributed to set,
fatigue, habituation, attention, displacement, and so
on. At the physiological level, mechanisms have
been proposed for mediating the interactions of a
number of individual neurons, or a number of pools
of neurons, each of which controls a behaviour, so
that one or another is temporarily dominant.
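To make the idea of mutual inhibition concrete, here is a minimal generic sketch. It is not the flip-flop model or McFarland's scheme examined in this paper, and it does not use the SNF block model; it is simply two leaky units, each driven by its own stimulus and suppressed by the other's activity. The function name `mutual_inhibition`, the inhibition weight, the time step and the stimulus values are all assumptions chosen for illustration.

```python
def mutual_inhibition(stim_a, stim_b, weight=2.0, dt=0.1, steps=500):
    """Simulate two leaky units that inhibit each other.

    Each unit relaxes towards its rectified net drive:
    its own stimulus minus weight * (activity of the other unit).
    """
    act_a = act_b = 0.0
    for _ in range(steps):
        # Both drives are computed from the previous activities (synchronous update).
        drive_a = max(0.0, stim_a - weight * act_b)
        drive_b = max(0.0, stim_b - weight * act_a)
        act_a += dt * (drive_a - act_a)
        act_b += dt * (drive_b - act_b)
    return act_a, act_b


# Hypothetical conflict: the stimulus for behaviour A is slightly stronger.
eat, drink = mutual_inhibition(stim_a=1.0, stim_b=0.8)
```

With an inhibition weight above 1, a small difference in stimulus strength grows until one unit silences the other, so only one behaviour is dominant at a time.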

This paper examines the consequences of making
some very simple assumptions about the
requirements which the mechanism must satisfy, and
examines two possible implementations on paper, on
the bench, and in a mobile robot.


2.  Conflict  and  Resolution 

We have already assumed that the situation is one in
which several stimuli are present. Let us further
assume that each stimulus is linked to a separate
behaviour, which is appropriate to the stimulus. For
example, there may be food and water present, and
behaviours for eating and drinking to which they are
linked. The first step in considering any restrictions on
the nature of any controlling mechanism might
usefully be to imagine that no control mechanism is
present. What will happen if the behaviours occur
simultaneously? There appear to be three main
possibilities:

a). The behaviours operate satisfactorily together - e.g.
I can walk and chew gum at the same time.

b). The behaviours fail to operate satisfactorily
together because they require effectors to carry out
incompatible actions - e.g. I cannot eat and roar at an
enemy simultaneously.

c). The behaviours fail to operate satisfactorily
together because the effects of one behaviour - e.g.
flight - make it impossible to continue with another -
e.g. mating.

Let us confine our attention to situations (b) and (c).
While it is true that (c) may bring a sort of resolution
to the conflict between the two behaviours, it is worth
noting that there is no guarantee of this. An oscillation
between the two behaviours might occur on an
arbitrary time scale (flee for a second, mate for a
second). This could also occur in (b), and might even
in some circumstances be an effective strategy,
amounting to time sharing [McFarland 73]
[McFarland 74]. However, both the robot designer

Abstract

The paper sets the first steps towards a theory of emergent
functionality. The theory tries to make explicit what
emergent functionality is by contrasting it with hierarchical
functionality. It analyzes the principal advantages of this
approach and conjectures a formal structure common to
systems with emergent functionality.

1.  Introduction 

Emergent functionality means that a function is not
achieved directly by a component or a hierarchical system
of components, but indirectly by the interaction of more
primitive components among themselves and with the
world. Emergent functionality has become one of the main
themes in research on Artificial Life (Langton, 1988) and
autonomous agents (Brooks, 1989). So far, engineers and
scientists have used their intuitions to build systems that
exhibit emergent functionality but there is no explicit
theory yet on what emergent functionality is, how it can
be achieved, when it is appropriate and why. This paper
reports on research to understand the principles underlying
emergent functionality and how it can be used for
designing and building systems. We first discuss hierarchical
systems to make the specific properties of systems with
emergent functionality stand out more clearly. Then we
turn to emergent functionality itself. Examples are
discussed and the advantages of emergent functionality are
analyzed. The final part of the paper conjectures a formal
structure that seems common to systems with emergent
functionality.


2.  Hierarchical  systems 

2.1.  Characterization  of  hierarchical  systems 

In hierarchical systems there is a direct relationship
between structure and function. The system consists of a
set of components. Each of these components has three
aspects: (1) inputs and outputs, (2) a control element to
turn the component on or off, (3) a well-determined
functionality. A component stands on its own in the sense that
its functionality can be tested independently from the
other components. Moreover this functionality is a
recognizable subfunction of the global functionality of the
system. For example, the motor of a car needs fuel. The tank
realizes a subfunction of fuel supply, namely to hold the
fuel. The tank realizes this function independently from
the other components like the pipes that transfer fuel to
the motor or the accelerator that regulates the flow.

Some components are specialized in obtaining input
from the environment. Others are concerned with output
and actions in the environment. There are also
components whose major role is the control of the operation
of other components. So the different components interact
in two ways: (1) There is flow of data through
input/output relations between components. (2) There is
flow of control when one component turns another
component on or off.
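The three aspects of a component and the two kinds of interaction can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the class name `Component`, the numeric fuel quantities, and the choice to model a switched-off component as producing 0.0. Only the tank, pipes and accelerator roles are taken from the car example in the text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Component:
    name: str
    function: Callable[[float], float]  # (3) well-determined functionality
    enabled: bool = True                # (2) control element: on or off

    def run(self, value: float) -> float:
        # (1) input in, output out; a switched-off component yields 0.0 here.
        return self.function(value) if self.enabled else 0.0


# Hypothetical fuel-supply chain following the car example.
tank = Component("tank", lambda demand: min(demand, 40.0))       # holds the fuel
pipes = Component("pipes", lambda fuel: fuel)                    # transfer fuel
accelerator = Component("accelerator", lambda fuel: 0.5 * fuel)  # regulates flow


def fuel_to_motor(demand: float) -> float:
    # Flow of data: each component's output is the next component's input.
    return accelerator.run(pipes.run(tank.run(demand)))


flow_on = fuel_to_motor(10.0)
pipes.enabled = False   # flow of control: a component is turned off
flow_off = fuel_to_motor(10.0)
```

Because each component is a self-contained function, `tank.run` can be called and checked on its own, which mirrors the claim that a component's functionality can be tested independently of the others.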

Because  the  components  function  independently  of 
each  other  they  can  be  constructed  and  put  together  in  a 
modular  fashion:  The  global  functionality  is  decomposed 
into  different  subfunctions.  A  subfunction  is  either  directly 
realized by a particular component or it is further decom-

Abstract

In the first part of this paper I argue that the
learning problem facing animats is essentially
that which has been studied as the reinforcement
learning problem: the learning of behavior
by trial and error without an explicit
teacher. A brief overview is presented of the
development of reinforcement learning
architectures over the past decade, with references
to the literature.

The second part of this paper presents Dyna,
a class of architectures based on reinforcement
learning but which go beyond trial-and-error
learning. Dyna architectures include a
learned internal model of the world. By
intermixing conventional trial and error with
hypothetical trial and error using the world
model, Dyna systems can plan and learn
optimal behavior very rapidly. Results are shown
for simple Dyna systems that learn from trial
and error while they simultaneously learn a
world model and use it to plan optimal action
sequences. We also show that Dyna
architectures are easy to adapt for use in changing
environments.
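The intermixing of real and hypothetical trial and error can be sketched as follows. This is a minimal tabular illustration under loudly stated assumptions, not the specific systems reported in the paper: the reinforcement learner is one-step Q-learning, the world model is a lookup table of the last observed outcome for each state-action pair, and the corridor world, the function names `step` and `dyna_q`, and all parameter values are hypothetical.

```python
import random

N_STATES, GOAL = 6, 5
ACTIONS = (0, 1)  # 0 = left, 1 = right


def step(state, action):
    """Hypothetical corridor world: reward 1.0 only on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    return nxt, (1.0 if nxt == GOAL else 0.0)


def dyna_q(episodes=30, planning_steps=20, alpha=0.5, gamma=0.9,
           epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    model = {}  # learned world model: (state, action) -> (next_state, reward)

    def update(s, a, r, s2):
        # One-step Q-learning backup; the goal state is terminal.
        best_next = 0.0 if s2 == GOAL else max(q[(s2, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

    for _ in range(episodes):
        s = 0
        while s != GOAL:
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                best = max(q[(s, b)] for b in ACTIONS)
                a = rng.choice([b for b in ACTIONS if q[(s, b)] == best])
            s2, r = step(s, a)               # conventional trial and error
            update(s, a, r, s2)
            model[(s, a)] = (s2, r)          # learn the world model
            for _ in range(planning_steps):  # hypothetical trial and error
                ps, pa = rng.choice(list(model))
                ps2, pr = model[(ps, pa)]
                update(ps, pa, pr, ps2)
            s = s2
    return q


q = dyna_q()
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
```

The same `update` routine is applied to real and to model-generated experience; setting `planning_steps=0` reduces the sketch to plain trial-and-error learning, which makes the contribution of the world model easy to compare.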

1  Animats  and  the  Reinforcement 
Learning  Problem 

What is an Animat? An animat is an adaptive system
designed to operate in a tight, closed-loop interaction
with its environment. An animat need not be a
learning system, but often it is; some sense of adaptation of
behavior to variations in the environment is required.

Figure 1 is a representation of the animat problem
as I see it. On some short time cycle, the animat
receives sensory information from the environment and
chooses an action to send to the environment. In
addition, the animat receives a special signal from the
environment called the reward. Unlike the sensory
information, which may be a large feature vector, or the
action, which may also have many components, the
reward is a single real-valued scalar, a number. The goal
of adaptation is the maximization of the cumulative
reward received over time.

Figure 1. The Reinforcement Learning Problem
facing an Animat. The goal is to maximize cumulative
reward.

This formulation of the animat problem is the same
as that widely used in the study of reinforcement
learning. In fact, reinforcement learning systems can be
defined as learning systems designed for and that perform
well on the animat problem as described above.
Informally, we define reinforcement learning as learning by
trial and error from performance feedback, i.e., from
feedback that evaluates the behavior generated by the
animat but does not indicate correct behavior. In the
next section we briefly survey reinforcement learning
architectures.

One  might  object  to  the  problem  formulation  in 
Figure  1  on  the  grounds  that  all  possible  goals  have 
been  reduced  to  a  scalar  reward.  Although  this  appears 
limiting,  in  practice  it  has  proved  to  be  a  useful  way 
of  structuring  the  problem.  Some  examples  of  goals 
formulated  in  this  way  are: 

•  Foraging: Reward is positive for finding food
objects, negative for energetic motion, slightly
negative for standing still.

•  Pole-balancing  (balancing  a  pole  by  applying 
forces  to  its  base):  The  reward  is  zero  while  the 
pole  is  balanced,  and  then  becomes  -1  if  the  pole 
falls  over  or  if  the  base  moves  too  far  out  of 
bounds. 

•  Towers of Hanoi: Reward is positive for reaching
the goal state.
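The three example goals can each be expressed as a scalar reward function. The sketch below is illustrative only: the signs follow the descriptions in the list, but the magnitudes (1.0, -0.1, -0.01), the argument names and the function names are assumptions, since the text gives no numbers apart from the 0 and -1 of pole-balancing.

```python
def foraging_reward(found_food: bool, moved: bool) -> float:
    """Positive for food, negative for motion, slightly negative for standing still."""
    reward = 1.0 if found_food else 0.0
    reward += -0.1 if moved else -0.01  # assumed magnitudes
    return reward


def pole_balancing_reward(pole_fell: bool, out_of_bounds: bool) -> float:
    """Zero while balanced; -1 if the pole falls or the base leaves its bounds."""
    return -1.0 if (pole_fell or out_of_bounds) else 0.0


def hanoi_reward(state, goal_state) -> float:
    """Positive only when the goal state is reached."""
    return 1.0 if state == goal_state else 0.0


def cumulative_reward(rewards) -> float:
    """The quantity the animat tries to maximize."""
    return sum(rewards)


# Hypothetical pole-balancing episode: three balanced ticks, then a fall.
episode = [pole_balancing_reward(False, False)] * 3 + [pole_balancing_reward(True, False)]
total = cumulative_reward(episode)
```

In each case the whole goal is carried by one number per time step, which is the point of the problem formulation in Figure 1.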
If we wish a robot to behave adaptively, it
may be useful to design it according to
principles enabling animals and people to do so.
We use brain damage and drugs in animals to
break behavior down into its subcomponents,
and stages of recovery to reveal levels of
reintegration. By partial transection in the
brain, or by drugs, we appear temporarily to
inactivate central motor programs involved in
spontaneous behavior (20). As in earlier work
using complete transection (19), behavior breaks
down into reflexes; spontaneous
environmentally-directed behavior is absent.
But by our procedures, allied reflexes operate
as intermediate-level submodules. By studying
how they interact in recovery, and by analyzing
controls over individual reflexes, some principles
emerge, perhaps useful for robot design. In
addition, approaching the subject via phenomena
produced by pathology may yield insight into
imperfections in a robot that may arise from
imbalance in adaptive systems.

(1) Allied Reflexes can Act as Isolated,
Adaptive Submodules

Extensive lateral hypothalamic (LH) damage
produces a simplification of motivated behavior,
that we have called, after Magnus (10), a "zero-
condition". All spontaneous environmentally
oriented behavior is temporarily abolished. The
animal lies motionless, virtually comatose.
However, its autonomic system remains
relatively intact - if such an animal is tube-fed,
it lives and recovers (20). Within a couple of
days, somnolence usually disappears, but the
animal remains for several days in a state of
catalepsy and akinesia, symptoms often seen in
Parkinson’s disease (16).

For instance, an LH-damaged cat remains
for many minutes with its forelimbs spread
widely apart, or one foreleg placed up on its
back at quite an extreme angle. Similarly, a
cataleptic rat remains with its hindlegs on a
raised platform, forelegs on the floor, in an
awkward downward-tilted posture. These
symptoms support the generally held view that
dopamine-deficiency produces an inability to
initiate movement. It therefore seems
paradoxical when, as shown in fig. 1, a rat made
cataleptic by the dopamine receptor blocker
haloperidol will, if pushed from behind, leap
vigorously into the air (2, 12). This paradox is
resolved by detailed analysis of the rat’s
responses leading to the jump. As the cataleptic
animal is pushed forward (Fig. 1A), it braces
against such displacement by shifting its weight
backwards. When its hind legs begin to slip
(B), a leap is triggered (C and D), away from
the surface where it is unstable. When it lands
on the horizontal table top, it immediately
resumes immobility. Thus, cataleptic leaping is
merely an allied postural support defensive
reflex, triggered by postural instability. The
animal does not suffer from a general inability
to initiate movement - its isolated support
submodule simply does not do so when the
animal is in a stable posture, even if awkward.

The static postural support submodule consists
of an aggregate of allied reflexes (including
standing, crouching, bracing, clinging, stability-
related stepping, righting, and jumping), all of
which homeostatically maintain support or
regain upright unmoving stability. It is isolated
from other submodules, individually involved in
locomotion, turning, head-scanning, orienting,
and ingestion, which are inactivated (20). When
a submodule is isolated, inhibitory controls are

Abstract

In this paper we described a simplified model for a case of
functional self-organization. It deals with the emergence of a
particular form of task assignment and parallel hierarchical
organization within a social group which depend basically on the
interactions occurring between individuals and with their
immediate local surroundings. The task organization within the
colony appeared to be a distributed function which does not
require the presence of an individualized central organizer. We
discussed how such elementary processes could potentially be
applied in the coordination and self-organization of groups of
interacting robots with simple local computational properties
to perform a wide range of tasks.

Introduction 

In eusocial insect societies, all the individuals must cooperate
to perform a certain number of tasks the nature of which
depends on the internal needs of the colony, as well as on the
particular environmental conditions. At any time each
individual can act and interact either with other individuals or
with their environment and thus causes changes in the state of
the group. The group is nevertheless the focal point for a
stable, self-regulated organization of individual behaviors.
The study of the processes leading to the emergence of a
stable collective order in insect societies has recently
emphasized the importance of individual interaction dynamics
(Deneubourg et al., 1987; Deneubourg and Goss, 1989; Goss
et al., 1990; Beckers et al., 1990). This research has
demonstrated that quite simple elementary rules of individual
behavior often make it possible for the society to create
surprisingly complicated patterns and to make efficient
decisions when certain types of external constraints are
encountered.

Our biological study examined the processes involved in
task assignment in primitive Polistes wasp colonies
(Theraulaz et al., 1990 a, b and c). Polistes dominulus is a species
which is to be found in temperate, northerly regions. These
wasps had two advantages from the point of view of our study,
since they build their nest with no envelope and their colonies
contain a relatively small number of individuals (n « 20),
which makes it possible to observe all the members of a
colony over a period of time. These primitively eusocial species
have little individual differentiation and no morphological
differences between castes or predetermined conortti of
activities depending on age or on any other known
physiological predetermination. The integration and
coordination of individual activities therefore depend largely
on the interactions which take place among the members of
these societies and on the immediate relationships between a
society and its environment.

A two-fold morphogenetic process thus occurs within
the colony:

•  Each individual acquires its own behavioral profile
which is characterized by all the observable stable behavioral
items in which it takes part. All the profiles can be described
by a reduced number of behavioral forms to which the
various individual profiles belong (see Theraulaz et al., 1990
a).

•  In a society at a given moment in time, the whole set of
individual behavioral profiles does not constitute a random
sample of all the possible profiles; they constitute a profile
configuration which can be defined by the proportion of
individuals belonging to each of the behavioral forms.

The  way  this  configuration  is  controlled  by  internal  and 
external  constraints  on  the  colony  constitutes  the  task 
assignment  process.  The  model  presented  here  aims  at 
describing the task assignment process in a hierarchically
structured society.


ETHOLOGICAL AND PSYCHOLOGICAL MODELS OF
MOTIVATION - TOWARDS A SYNTHESIS


Frederick Toates,

Biology Department,

The Open University,

Milton Keynes
MK7 6AA,

United Kingdom.

Abstract 

By looking at a variety of individual
motivational systems, the first steps
towards reconciliation of models in the
psychological and Lorenz traditions are
made. A model that contains features
of both emerges. The relevance of this
model to animal welfare issues is
discussed.

1.0 Introduction

It seemed once that study of motivation theory was in
serious decline; in psychology and ethology, the days of
the grand theories (e.g. Hull, Lorenz) appeared to be
over (Toates, 1986). Psychologists lost interest in the
topic and ethologists had moved in their droves to the
greener pastures of optimal foraging and sociobiology.
Then applied ethology came to the rescue; a renewal of
interest in motivation theory was prompted by
considerations of animal welfare, suffering and the
associated recommendations for legislation (Dawkins,
1980). To assess when an animal is suffering or why
animals in captivity might spend much time in various
bizarre activities, not observed in wild-living
conspecifics, one needs motivation theory. However,
applied ethologists encountered a major difficulty:
fragmentation and contradictions in the literature. To
some extent, psychological and ethological approaches
differ in their assumptions and areas of interest.
However, differences do not divide neatly along party
lines. Thus, ironically, some writers in the psychological
tradition (e.g. Gallistel, 1980; Glickman and Schiff,
1967; Herrnstein, 1977; Hogan, 1967; Toates, 1986)
argued for the applicability of the best-known
ethological model, that of Lorenz (1950), whereas, to
some ethologists, it has been used for little more than
to illustrate how motivation doesn't work (Dawkins,
1986; Archer, 1988). The issue is more than of academic
interest; recommendations for animal husbandry depend
upon which model one believes (Baxter, 1983; Hughes
and Duncan, 1988a,b).

Some of the conflict seems to arise from
semantic confusion, loose use of language and the
assumptions of different perspectives, and we attempt to
resolve some of this. We suggest where common ground
can be found. First, it is necessary to try to sanitize the
vocabulary.

1.1 The meaning of motivation

One  reason  for  confusion  seems  to  be  the  meaning 
different  authors  assign  to  the  concept  of  "motivation". 
Even  within  psychology,  the  concept  is  used  in  very 
different  contexts  and  meanings  by  different  researchers 
(Toates,  1986).  In  psychology,  "drive"  and  "motivation" 
are  overlapping  and  sometimes  identical  concepts,  while 
in  ethology,  since  the  influential  work  of  Hinde  (1959; 
1960).  "drive"  has  assumed  a  role  much  like  a  swear¬ 
word  in  church,  while  "motivation"  has  survived  as  a 
well-respected  term.  The  term  "motivation”  is  found  in 
most  ethology  books,  but  rarely  defined  in  a  way  that 
enables  one  to  know  what  is  meant  by  claiming  that  an 
animal  is  more  or  less  motivated  for  certain  behaviour. 
Although  mainly  used  to  discuss  how  responses  may  vary 
in  different  contexts,  "motivation"  is  usually  only  a 
heading for the chapter in which this is found. For
example, Slater (1985) explicitly takes this broad view,
in saying, while discussing the variability of responses in
animals: "...discovering just what it is that leads them
to behave differently from one time to another presents
some very interesting problems... These are the problems
of motivation or, in other words, the mechanisms
leading animals to do what they do when they do it".

McFarland and Sibly (1975) provided a formal
representation of motivation, the state-space approach.
Although primarily introduced as a powerful alternative
to unitary drive concepts, they also tried to tidy up the
terminology. Thus, "motivational state" is the state value
of all causal factors influencing a set of functionally
related behaviour patterns. The motivational state maps
onto a "tendency", which is the strength of the behaviour
in the competition for the motor apparatus (the "final
common path").
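
The state-space terminology can be made concrete with a small sketch. It is only an illustration under stated assumptions: the causal-factor names, the multiplicative mapping from state to tendency, and the winner-takes-all rule are hypothetical choices made here, not McFarland and Sibly's own formulation.

```python
from typing import Callable, Dict

# A motivational state: the current values of all causal factors.
# The factor names used below are illustrative assumptions.
State = Dict[str, float]
Mapping = Callable[[State], float]


def tendencies(state: State, mappings: Dict[str, Mapping]) -> Dict[str, float]:
    """Map the motivational state onto one tendency per behaviour."""
    return {behaviour: f(state) for behaviour, f in mappings.items()}


def select_behaviour(state: State, mappings: Dict[str, Mapping]) -> str:
    """The behaviour with the strongest tendency wins the 'final common path'."""
    t = tendencies(state, mappings)
    return max(t, key=t.get)


# Assumed mappings: internal deficit multiplied by external cue strength.
mappings = {
    "feed": lambda s: s["energy_deficit"] * s["food_cue"],
    "drink": lambda s: s["water_deficit"] * s["water_cue"],
}
state = {"energy_deficit": 0.8, "food_cue": 0.5,
         "water_deficit": 0.6, "water_cue": 0.9}

print(select_behaviour(state, mappings))
```

Because the selected behaviour is by construction the one with the strongest tendency, observing the behaviour alone cannot test the model.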

This may be a convenient and fairly unambiguous
way of talking about motivation, but for many
psychologists, it doesn't relate to their conception of
motivational problems. In the terminology of
McFarland and Sibly, all behaviour of any animal will
by definition be guided by a motivational state,
determining its tendency. This leads to the danger of a
circular argument, since by the definition, an animal
will always be under the control of the behaviour for
which the tendency is strongest, i.e. for which it is most
motivated. So, if all behaviour reflects a hypothetical
motivational state, which can only be deduced from
observations of the behaviour performed, we run into

Abstract

We consider psychology as the study of adaptive
agency, investigating the processes and mechanisms
resulting in fitness-increasing behavior in the world.
A central issue in psychology so construed becomes:
what are the relations between the primary adaptive
process of evolution by natural selection, and the
adaptive processes psychologists call 'learning'? In
particular, under what conditions would learning
evolve? To explore this issue, we use genetic
algorithms to simulate the evolution by natural selection
of neural networks, which in turn control the
behavior of simple creatures in virtual environments.
We have developed what we consider the simplest
possible environmental challenge in which
unsupervised associative learning could prove adaptive:
'bootstrapping' the learned use of one highly
accurate, but individually varying, sensory modality by
another less accurate, but evolutionarily stable,
modality. We have found a possibly quite general
U-shaped curve relating the time (in number of
generations) to evolve the use of unsupervised learning
on the varying 'bootstrapped' modality, to the
accuracy of perception in the stable modality which
guides this learning. This U-shaped curve appears to
represent a trade-off between the adaptive pressure
to evolve learning (which peaks when perception
accuracy in the stable guiding modality is at chance)
and ease of learning during a given lifespan
(which peaks when this accuracy is perfect).
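
As a rough illustration of the general method named above (a genetic algorithm evolving the weights of a behavior-controlling network), the following minimal sketch may help. Everything specific in it is an assumption made for illustration: the one-unit "network", the toy food/poison task, the truncation selection, and all parameter values. It does not include within-lifetime learning and is not the simulation reported in this paper.

```python
import random


def act(weights, cue):
    """One-unit 'network': approach (True) if the weighted cue exceeds the bias."""
    w, bias = weights
    return w * cue > bias


def fitness(weights, trials):
    """Count trials in which the creature approaches food and avoids poison."""
    return sum(act(weights, cue) == is_food for cue, is_food in trials)


def evolve(trials, pop_size=20, generations=30, sigma=0.3, seed=0):
    """Evolve weight vectors by truncation selection plus Gaussian mutation."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        population.sort(key=lambda g: fitness(g, trials), reverse=True)
        history.append(fitness(population[0], trials))
        parents = population[: pop_size // 2]
        # Parents survive unchanged; the rest are mutated copies of parents.
        children = [[gene + rng.gauss(0, sigma) for gene in rng.choice(parents)]
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    population.sort(key=lambda g: fitness(g, trials), reverse=True)
    history.append(fitness(population[0], trials))
    return population[0], history


# Hypothetical environment: strong cues signal food, weak cues signal poison.
trials = [(0.9, True), (0.8, True), (0.7, True),
          (0.2, False), (0.1, False), (0.3, False)]
best, history = evolve(trials)
print(history[0], history[-1])
```

Because parents are carried over unchanged, the best fitness in this sketch can never decrease from one generation to the next.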


1 Introduction

Natural selection has constructed animals' minds and
behaviors for adaptive fit to the environmental problems
they must face. As the study of such minds and behaviors,
psychology should focus on the notion of adaptive agency -
the generation of action in the world in response to
challenges to individual fitness. This framework encompasses
many approaches, including (1) the elucidation of complex
species-typical adaptations (as in human and animal
experimental psychology and cognitive ethology), (2) the
comparison of psychological adaptations across species and
consideration of their phylogenetic origins (as in
comparative psychology), and (3) the general exploration of the
adaptive processes themselves that yield adaptive agency
(e.g. by simulation methods, including those in the field of
artificial life - see Langton, 1989). In the current paper, we
consider the phenomenon of 'learning' as an aspect of
adaptive agency, by investigating via evolutionary simulations
some conditions under which the ability to learn may prove
adaptive and so spread through a population. The
theoretical, historical, and methodological background for this work
is presented more extensively in Miller and Todd (1990),
and further extensions of this method applied to habituation
and sensitization as adaptations to short-term environmental
dynamics appear in Todd and Miller (in press).

Evolution as an adaptive process has itself undergone
changes: "survival of the stable" probably preceded
"survival of the fittest" (Dawkins, 1976). Evolution in the
earth's early environment is likely to have selected for
replicating systems with relative stability in the shifting
primordial soup. After stability came replication and metabolism:
the ability to turn external material into copies and
extensions of oneself. The evolution of larger, more complex
phenotypes then allowed the evolution of behavior-generating
systems that could produce innately programmed
sequences of activity and movement. Sensory systems
could then evolve to guide these behavior-generators more
adaptively, based on sensitivity to particular environmental
cues. Thus, blind activity may have preceded reactivity -
the ability to adaptively adjust to the current changing
environment on a moment-by-moment basis. Only after these
first two stages had evolved could a further adaptive process
evolve - 'learning,' defined as the ability to make
long(ish)-term adaptive changes in behavior-generators in
response to the environment.¹ In this theoretical framework,
learning emerges not as the primary adaptive force some
have assumed it to be, but rather as a tertiary one, following
genotypic evolution and short-term environmental reactivity
(see also Shepard, 1987, 1988). Once we re-conceptualize
'learning' as merely one process among several that
generate adaptive agency, the questions we might ask about this
process begin to change as well.


¹ By this definition, learning includes such processes as
experience-guided development not commonly included in this
category. For examples of such processes, see Knudsen, 1988;
Singer, 1984, 1988; and Stein et al., 1989.

Abstract

This  paper  describes  a  computer  simu¬ 
lation  of  an  animal  environment  which 
has  been  created  as  a  tool  for  investigat¬ 
ing  the  mechanisms  behind  ‘behavioural 
choice’  in  animals.  The  simulated  en¬ 
vironment  has  been  designed  to  pro¬ 
vide  sufficient  complexity  and  realism  for 
meaningful  behavioural  experiments. 

1  The  Problem 

“All  brains,  even  those  of  the  tiniest  insects,  generate 
and  control  behaviour.”  [Albus,  81].  This  basic  task 
of animal brains can be split into three subtasks (e.g.
[Brooks, 86]) as shown in Fig. 1:

1 - Sensing of the environment so as to be able to perceive
what is going on at each moment in time (perception).

2 - Taking the interpretation of the environmental
situation and using it to decide which of the animal's
repertoire of behaviours is the most appropriate (behavioural
choice).

3 - Transforming the chosen behaviour into a pattern of
movements of parts of the body (motor control).

Figure 1: Three functions of an animal brain.

Our aim is to investigate the mechanisms underlying
the second of the three stages outlined. This stage
is perhaps the least well understood of the three due
to the fact that the processes involved are internal and
cannot be directly observed, only inferred from the
resulting behaviour. The behavioural parts of a brain do
not interact directly with the outside world, but rather
through the interfacing systems of perception and motor
control [Halliday, 83].

It should be noted that the use in this paper of
the term 'behaviour' (e.g. in 'behavioural choice') refers
to a pattern of actions such as eating, mating,
avoiding predators, etc. This should be distinguished from
the usual use of the term 'behaviour' in the AI
literature, where it more often denotes the coordinated
movement of a set of limbs (e.g. in order to grasp an object,
navigate around an obstacle, etc). We want to look at
high-level decision-making (e.g. should the animal
obtain food from the nearby fruit bush or else flee from the
predator that has just appeared in the distance). We
do not want to address the problem of how the animal
should move its limbs in order to pick the fruit and
transfer it to its mouth or how the animal should best move
its legs so as to be able to run away from the predator.

In short, we want to examine the second of the
three functions of Fig. 1, while ignoring the other two
as much as possible.

2 Why a Simulated Environment?

Given that we want to examine behavioural choice
mechanisms, what is the best method of going about it?

One approach to examining behaviour has
involved the building of robots which can navigate in a

ABSTRACT

The computer metaphor, based on rule and
symbol manipulation, is challenged by our
connectionist model. Neural nets, Parallel
Distributed Processing systems, etc. are able
to achieve goals by way of cooperative
activity and can ignore all gadgets that are
required in the computational models based
on the digital computer metaphor. The
consequences of our structure-oriented modelling
approach for causal and functional
explanations in ethology are discussed.

1. INTRODUCTION

Ethologists use the modelling approach in order to
reveal the mechanisms underlying overt behaviour.
Relevant principles of organization are tested by
embodying them in a mathematical model, the behaviour
of which is compared to experimental data. The serial
computer metaphor and systems theory have set the
fashion for some time in modelling information
processing, such as is involved in problems of decision-making.
Models based on the computer metaphor manipulate
bits of data in a formal way, according to preset rules
and operations which are specified in programs that
were designed for a given task (van Rhijn 1977; van
Rhijn & Westerterp-Plantenga 1989; Coderre 1989;
Travers 1989). This way of modelling involves a
distinction between system hardware and software, and
postulates a central processor that operates on the data
and drives the system. These procedures are attractive
from a methodological point of view, in that one knows
the complete set of assumptions necessary to make
these constructs work as desired. But what is
contributed to causal analysis when we apply this approach?
In practice, one postulates homunculi with the self-same
capacities that the theory sets out to explain.
Functionally defined concepts take up key positions in the
computer model but these concepts lack a proper
(albeit potential) backing of causal mechanisms that
could perform the assumed tasks. The approach of
cybernetics and system theory (Toates 1986) may provide
appropriate solutions when dealing with simple
phenomena (e.g., orientation), but their theoretical framework
is not suited for handling more complex phenomena.
Moreover, many models based on this conventional
modelling approach happen to be goal-directed.

The condition of an 'explicit goal-representation'
makes conventional modelling an inappropriate technique
for the assessment of decisions that are made by our
experimental animals: parasitic wasps. Many parasitic
wasps are known to adjust the offspring sex ratio to
characteristics of the environment, to maximize
reproductive success. They are able to optimize the F1
offspring sex ratio: the actual ratio sons : daughters is
such that the maximum number of gene copies will be
present in the F2 progeny. In order to explain the
actual decisions of our wasps (but we believe that this
point can be generalized, see McFarland 1989), we
require systems which adapt to the environment without
an explicit representation of the world, and which can
achieve goals without a representation of these goals
(i.e., without goal directedness). Conventional explanatory
concepts are therefore of no use when applied to the
decisions made by our experimental animals.

Recent developments in the field of Artificial
Intelligence have expanded the possibilities to develop
ethologically relevant models: systems of semi-autonomous
information-processing entities, whose local
interactions with one another are controlled by a set of
simple rules. Such systems do not contain rules for their
behaviour at the global level. The observable behavioural
output and its complex dynamics are emergent properties,
which develop from the local interactions of the
low-level entities (Langton 1989). Connectionist models
perform without an explicit representation of a goal or
an environment. This kind of modelling is the tool for
exorcising the homunculi from ethological theory.


2.  BIOLOGICAL  BACKGROUND 

Sex allocation theory presents one of the finest oppor-

Abstract

We cannot expect to know the detailed "wiring
diagram" of the nervous system for any intelligent
creature for quite a long time. Even then, the true
organization is likely to be incredibly complex and
tangled. However, in order to build intelligent robots
now, we need a plausible interim architecture. A
functional model for robot organization is proposed,
starting with a basic, first order model, which is
gradually refined. In particular, it is proposed that
associative memory provides a useful - and perhaps
plausible - basis for an intelligent system.

0. Introduction

While remarkable progress is being made by neuroscientists
in unraveling portions of the nervous system (see, for
example, [Kosslyn 89] or [Halgren 87] for insights into the
visual system and memory systems, respectively), we are
still far from being able to map the wellsprings of action,
intention, and decisions. Other researchers have investigated
abstract models of adaptation and learning, such as genetic
algorithms and classifier systems [Holland 77], or the
SOAR system [Newell 87]; abstract models have been used
to build explicit models of creatures (e.g. the Animat
[Wilson 87]). [Drescher 89] has introduced the "schema
mechanism," and his ideas have much in common with the
proposals below, especially in his views on chaining, and in
his key ideas on identifying and learning reliable schemas,
using large amounts of statistical analysis. "Subsumption
architecture" researchers in AI (e.g. [Brooks 86], [Maes 90])
hope to arrive at intelligent systems by first building a
(layered) system with the abilities of, say, a cockroach, and
adding yet more control layers to eventually reach greater and
greater intelligence. This work is broadly within a "Society
of Mind"-type theory that views intelligence as composed of
a very large number of independent agents and
"bureaucracies" of agents, each responsive to specific
situations or patterns [Minsky 87]. While I subscribe in general
to the Society of Mind view, I believe that it is both
possible and valuable to divide up the model of mind
somewhat differently than is done within subsumption
architectures.

I propose here a model of a robot's "mind" whose
components are divided up along very different lines, somewhat
analogous to principal components analysis; the first
component is a general associative memory model that
captures general patterns and principles of behavior, while
successive components add refinements, culminating in
Society of Mind-like demons that recognize very specific
situations or patterns, and then override (by priming or
inhibiting) more general behaviors. Intermediate
refinements include control structures that allow search and
chaining of actions, as well as rote learning and
generalization. Such a model fits neatly on any massively
parallel computer architecture (e.g. [Hillis 85]), but can also
be simulated on serial computers (though perhaps not fast
enough to allow real-time performance, except in the
simplest of environments).

1.  Principle  One 

Use  associative  memory  as  the  overall  organizing 
conception. 

Basic associative memory operations can capture the essence
of what intelligent entities do: select relevant precedents in
any situation, and act on them. "Precedents" can be actions,
options, remindings, etc. This type of operation, akin to
case-based reasoning (CBR) [DARPA 88, 89] and memory-based
reasoning (MBR) [Stanfill & Waltz 86], is easily
programmed on a massively parallel machine, and has found
useful applications [Waltz 90]. A number of techniques can
be used to find "relevant" items, including nearest-neighbor
algorithms, and majority votes of n nearest neighbors.^
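
The selection of precedents by nearest-neighbor lookup and majority vote can be sketched as follows. This is a minimal serial illustration, assuming that situations are numeric feature vectors and that "near" means small Euclidean distance; both are assumptions made here, and the stored memory and action names are hypothetical.

```python
import math
from collections import Counter


def nearest_precedents(memory, situation, n=3):
    """Return the n stored (situation, action) pairs closest to the query."""
    return sorted(memory, key=lambda item: math.dist(item[0], situation))[:n]


def choose_action(memory, situation, n=3):
    """Act on the majority vote of the n nearest precedents."""
    votes = Counter(action for _, action in nearest_precedents(memory, situation, n))
    return votes.most_common(1)[0][0]


# Hypothetical memory of past situations and the actions taken in them.
memory = [
    ((0.0, 0.0), "rest"),
    ((0.1, 0.2), "rest"),
    ((1.0, 1.0), "flee"),
    ((0.9, 1.1), "flee"),
    ((1.2, 0.8), "flee"),
]

print(choose_action(memory, (1.0, 0.9)))
```

On a massively parallel machine the distance computations would run concurrently across all stored items; the sorted scan here stands in for that step.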

If only a single precedent is close to the current situation
(as when the robot is operating in a familiar environment on
a familiar task), then little more than an associative memory
is needed in order to act intelligently. Only when two or


*This work was supported in part by the Defense Advanced
Research Projects Agency, administered by the U.S. Air Force
Office of Scientific Research under contract #F49620-88-C-
0038.

^What makes a neighbor "near" is a very subtle issue, and the
key open problem in CBR and MBR.

Abstract

Habituation  is  a  basic  form  of  learning  in  which 
animals  come  to  respond  less  and  less  to  repeated 
presentation  of  a  given  stimulus.  Studies  of 
habituation  in  Aplysia  have  yielded  important 
insights  into  basic  mechanisms  of  synaptic 
plasticity,  while  studies  in  humans  reveal 
stimulus-specific  habituation  with  mutual 
dishabituation  by  pairs  of  different  stimuli. 
Studies  in  toads  suggest  a  new  phenomenon  which 
leads  us  to  new  models  of  vertebrate  learning 
which  are  subject  to  experimental  test.  Instead  of 
mutual  dishabituation  for  different  worm-like 
stimuli,  toads  exhibit  a  dishabituation 
hierarchy,  in  which  stimulus  A  may 
dishabituate  B,  but  not  vice  versa.  We  offer  a 
model  of  this  hierarchy  in  which  the  toad's 
visual  discrimination  is  reflected  in  different 
firing  rates  in  some  higher  visual  center, 
hypothetically  anterior  thalamus.  This  theory, 
developed  through  neural  simulation  based  on  an 
extensive  model  of  toad  retina,  predicts  that 
retinal R2 cells play a primary role in the
discrimination  while  R3  cells  refine  the  feature 
analysis  by  inhibition. 

The theory predicts new dishabituation hierarchies
based on reversing stimulus-background contrast
and shrinking stimulus size. After
the predictions were made, several were tested
by behavioral experiments. In particular, we
selected a pair of stimuli whose ordering in the
dishabituation hierarchy we predict to be
changed by contrast reversal, and the
experimental result is as predicted. A size
shrinking prediction failed to be validated, and
further experiments suggest that visual pattern
discrimination in toads is relatively unaffected
by stimulus size. Finally, we discuss new insights
into a network, offered as a preliminary model of
the medial pallium, that can express the
dishabituation hierarchy of toads.

^ The research described in this paper was supported
in part by grant no. 1R01 NS 24926 from the National
Institutes of Health (M. A. Arbib, Principal
Investigator). We wish to express our sincere thanks
to Prof. J.-P. Ewert with whom one of us (W.-D.L.)
conducted the experiments that greatly furthered the
modeling reported here.


1.  Background 

Habituation  is  an  elementary  form  of  learning  in 
which  response  to  a  stimulus  will  diminish  with 
repeated  presentation  of  the  stimulus  if  there  is  no 
punishment  or  reward  associated  with  the 
presentations.  In  the  marine  mollusc  Aplysia, 
habituation  has  been  much  studied  for  insight  into 
molecular  mechanisms  of  synaptic  plasticity  (see 
Bailey  and  Kandel,  1985,  for  a  review).  Here, 
habituation  seems  to  be  independent  of  the  specific 
patterning  of  stimuli,  whereas  habituation  in 
mammals  is  stimulus  specific.  Given  two  different 
patterns  A  and  B  (e.g.,  two  tones  of  different  pitch, 
volume,  or  duration),  the  animal  can  exhibit  this 
specificity  by  the  phenomenon  of  dishabituation 
(i.e.,  stimulus  B  can  release  behavior  despite 
habituation  to  A).  Moreover,  this  dishabituation  is 
mutual in mammals: If stimulus A can dishabituate
stimulus  B,  then  stimulus  B  can  dishabituate  stimulus 
A  (Thompson  and  Spencer  1966;  Sokolov  1975). 
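
Stimulus-specific habituation with mutual dishabituation, as described for mammals, can be illustrated with a very small sketch. The geometric decay of the response and the absence of spontaneous recovery are simplifying assumptions made here for illustration; the class and parameter names are hypothetical and this is not the neural model developed in this paper.

```python
class Habituator:
    """Toy stimulus-specific habituation: each stimulus has its own response strength."""

    def __init__(self, decay=0.5):
        self.decay = decay      # assumed fraction of the response kept after each presentation
        self.strength = {}      # response strength per stimulus

    def present(self, stimulus):
        """Return the response to this presentation, then weaken future responses."""
        response = self.strength.get(stimulus, 1.0)
        self.strength[stimulus] = response * self.decay
        return response


animal = Habituator()
print([animal.present("A") for _ in range(3)])  # response to A diminishes
print(animal.present("B"))                      # B still releases a full response
```

Because strengths are stored per stimulus, the sketch is symmetric in A and B, which corresponds to the mutual case; it cannot by itself express an ordered hierarchy.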

Abstract

Song  birds  learn  their  songs,  but  the  accuracy 
of  the  copying  process  varies.  As  a  result  the 
songs  present  in  an  area  change  with  time. 

Our  evidence  to  date  suggests  that 
various  aspects  of  chaffinch  song  in  the  wild 
can  be  accounted  for  by  a  simple  random  model 
in  which  individuals  learn  the  songs  that  they 
sing from various adults. The distribution of
songs between repertoires of different birds is
best matched by simulations with a 15%
copy-error rate. This rate of error, combined
with a realistic mortality rate of 40%, also
gives a good approximation to the changes in
the song types present in a population with
time.

Simulations  have  been  used  to  examine 
the  distribution  of  song  types  between 
individuals in a population. In the simplest
situation examined, all birds had a single song
type and four neighbours, and new birds copied
their song from one of those neighbours; here
small groups of birds sharing a song type were
found, as in some dialect species. An extension
of this approach to simulate variations in
repertoire size or in numbers of neighbours
has recently shown that both these factors can
have a strong effect on the sharing of song
types and their distribution in the population.
If a bird chooses the commonest song type sung
by its neighbours, rather than one of them at
random, very large groups of birds can occur.

These simulations suggest that the
complex distributions of song types often found
in wild bird populations may result simply
from random copying processes which are not
always exact.
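
The simplest situation described above (one song type per bird, four neighbours, random copying with occasional error) might be sketched as follows. The wrap-around square grid, the yearly synchronous update, and the function and parameter names are assumptions made here for illustration; the default rates simply reuse the 40% mortality and 15% copy-error figures quoted above, which were fitted to the repertoire model rather than to this simplified case.

```python
import random


def simulate(size=10, years=50, mortality=0.4, copy_error=0.15, seed=0):
    """Return a size x size grid of song types after the given number of years."""
    rng = random.Random(seed)
    # Start with every territory holder singing its own unique song type.
    songs = [[r * size + c for c in range(size)] for r in range(size)]
    next_type = size * size
    for _ in range(years):
        previous = [row[:] for row in songs]  # songs heard by this year's newcomers
        for r in range(size):
            for c in range(size):
                if rng.random() >= mortality:
                    continue  # the resident survives and keeps its song
                if rng.random() < copy_error:
                    songs[r][c] = next_type  # miscopy: a new song type arises
                    next_type += 1
                else:
                    # The newcomer copies one of its four neighbours at random.
                    dr, dc = rng.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
                    songs[r][c] = previous[(r + dr) % size][(c + dc) % size]
    return songs


grid = simulate()
print(len({song for row in grid for song in row}), "song types remain")
```

Counting contiguous patches that share a song type in the returned grid would be one way to look for the small groups mentioned above.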


1.  Introduction 

Learning  plays  a  role  in  the  song  development  of 
all  songbirds  studied  to  date  (see  review  by  Slater 
1989).  In  many  cases  the  copying  of  song  takes 
place  from  neighbours  when  young  birds  first  set 
up  their  territories  so  that  birds  on  adjacent 
territories  tend  to  share  songs,  while  the  songs  of 
those  further  apart  are  less  similar.  While  the 
learning  can  be  remarkably  accurate,  so  that  the 
songs  of  two  individuals  are  often  identical,  there 
is  good  evidence  from  a  number  of  studies  that 
inaccuracies  of  copying  may  lead  to  new  forms  of 
song  arising  (e.g.  Jenkins  1978,  Slater  &  Ince 
1979).  These  ’cultural  mutations’  may  be  the 
reason  why  the  songs  present  in  a  given  area 
change  with  time,  and  why  there  are  also 
differences  in  song  between  localities.  Much  of  the 
geographical  variation  in  song  is  complex, 
especially  in  cases  where  each  individual  has  a 
repertoire  of  several  different  song  types  and 
these  are  not  all  learnt  as  a  package  from  one 
other  bird.  However,  in  some  species  dialect 
areas  have  been  described  in  which  groups  of 
birds  share  the  same  song  type  or  types  and  are 
separated  from  each  other  by  more  or  less  sharp 
boundaries.  Whether  or  not  this  mosaic  pattern 
has  any  functional  significance  is  a  matter  of  a 
good  deal  of  controversy  (see  Baker  & 
Cunningham  1985). 

2.  Song  in  the  Chaffinch 

We  have  used  computer  simulation  mainly  to 
supplement  our  studies  of  song  distribution  in  the 
chaffinch (Fringilla coelebs), a small European
songbird.  Its  song  has  been  extensively  studied 
over  many  years,  starting  most  notably  with  the 
work  of  Marler  (1952)  on  song  in  the  wild  and 
the laboratory studies of Thorpe (1958) on song

Abstract

A research methodology is proposed for
understanding intelligence through simulation of
artificial animals ("animats") in progressively more
challenging environments while retaining
characteristics of holism, pragmatism, perception,
categorization, and adaptation that are often
underrepresented in standard AI approaches to
intelligence. It is suggested that basic elements of the
methodology should include a theory/taxonomy
of environments by which they can be ordered in
difficulty (one is offered) and a theory of animat
efficiency. It is also suggested that the
methodology offers a new approach to the problem of
perception.

1.  Introduction 

There  are  two  broad  approaches  to  the  scientific  under¬ 
standing  of  intelligence,  or  how  mind  arises  from  brain. 
One is the natural science approach, analyzing and
experimenting with phenomena of life, mind, and
intelligence as they exist in nature. In this there are two main
branches:  physiology  and  especially  neurophysiology, 
in  which  living  systems  are  subject  to  detailed  internal 
investigation;  and  experimental  psychology,  including 
studies  of  animals,  in  which  living  systems  are  studied 
through  their  external  behavior.  Related  to  the  latter, 
but  more  observational,  are  fields  such  as  linguistics 
and  anthropology. 

In  contrast,  the  second  broad  approach  to  intelligence 
may  be  termed  synthetic  and  computational,  in  which 
the objects studied are constructed imitations of living
systems  or  their  behavior.  In  "Computing  machinery 
and  intelligence",  Turing  (1950)  suggested  two  possible 
directions  for  the  computational  approach: 

We may hope that machines will eventually
compete with men in all purely intellectual
fields. But which are the best ones to start with?
Even this is a difficult decision. Many people
think that a very abstract activity, like the
playing of chess, would be best. It can also be
maintained that it is best to provide the machine
with the best sense organs that money can buy,
and then teach it to understand and speak
English. This process could follow the normal
teaching of a child. Things would be pointed
out and named, etc.

Turing's first proposed direction led to "standard AI"
or computational cognitive science. Standard AI is
basically competence-oriented, modelling specific human
abilities, often quite advanced ones. However, while
many AI programs exhibit impressive performance,
their relevance for the understanding of natural
intelligence is, in several respects, limited.

In  addressing  isolated  competences,  AI  systems  typ¬ 
ically  ignore  the  fact  that  real  creatures  are  always  situ¬ 
ated  in  sensory  environments  and  experience  varying 
degrees  of  need  satisfaction.  Furthermore,  the  systems 
attach  less  importance  to  such  basic  natural  abilities  as 
perception,  categorization,  and  adaptation  than  they  do 
to algorithmic processes like search and exact reasoning.
This leads eventually to problems connecting the
arbitrary symbols used in internal reasoning with external
physical stimuli ("symbol grounding" (Harnad,
1990)), and "brittleness" (Holland, 1986), the tendency
for AI systems to fail utterly in domains that differ even
slightly from the domain for which they were
programmed.

AI  systems  also  have  an  arbitrariness:  it  is  often  not 
clear  why  one  program  that  exhibits  a  certain  intellectu¬ 
al  competence  is  to  be  preferred  over  some  other  one  ex¬ 
hibiting  the  same  competence,  especially  since  the  field 
has  not  agreed  on— or  too  much  sought — a  clear  defini¬ 
tion  of  intelligence.  In  a  sense,  the  programmer's  facil¬ 
ity  for  imitating  a  high-level  fragment  of  human 
competence  is  a  kind  of  trap,  since  from  a  natural  sci¬ 
ence  perspective  there  is  usually  no  strong  relation  to 
nature. 

Turing's second proposal, for a "child machine", received,
over forty years, little attention or resources,
perhaps because it seemed fantastic. Yet the child machine
was to be situated from the start in a real sensory
environment and was to learn through experience. It
would have emphasized precisely the abilities that
standard AI minimized. Turing's proposal is in fact

Abstract

This paper describes an approach to evolving an
artificial insect that exists in a simulated
two-dimensional world and is controlled by a
"spreading-activation" neural network. Study of the
difficulty of hand-building neural networks has
resulted in a genetic expression, based on the notion
of the Von Neumann computer architecture and
the biological principles of DNA, which incorporates
both operations and data in a simulated
DNA strand. Work in progress indicates that by
embodying behavior parameters and actual neural
connections in a genotypic "language," and by
expressing that phenotypically as an animat, the
computer simulation appears able to evolve a
better species of animat through mutations upon
the genotype.

Introduction 

A domain of "artificial life" explores limited computer
simulations of animal behavior in an artificial
environment. This class of simulated animal is coming
to be known as the animat, a term first coined by
Stewart Wilson in [Wilson, 1985]. Research on the
notion of an artificial insect has proceeded in many
diverse headings (see for instance [Travers, 1988],
[Maes, 1990], [Park, 1988], [Wilson, 1987]). The
implementation described here is a direct extension of
Jack Park's animat as described in [Park, 1988].


The existing animat system that this program extends
utilizes a spreading-activation [Collins and Loftus,
1975, Anderson, 1983] neural network, which was
originally conceived as an exercise to ascertain the
capabilities of a neural net in controlling some sort of
process. This neural implementation has been applied
not only to the animat described here, but also to
process control in the manufacturing domain [Park
et al., in prep.], to part of a scientific discovery
system [Wood and Park, in press], and to the study of
the Piagetian development of an infant brain
[Wood, manuscript].

The "wetware" implementation controlling this animat
consists of a few dozen "neuron" nodes, each
connected to several other nodes with varying levels of
positive (excitatory) or negative (inhibitory) strength.
Sensors (input of sight, pain, hunger, taste, and satiation)
stimulate neurons, which in turn dissipate activation
to other neurons; muscle neurons that have
reached a certain threshold of activation will cause
appropriate routines to execute (to cause the bug to
move, turn, eat, or move randomly). Additionally, the
system applies a constant decay, to keep the overall
activation steady, and injects random noise into the
activation levels of all neurons. The system does not
employ learning methods during the "lifetime" of the
animat; this project studies only the non-plastic neurons
found in the lower animals.
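
The update cycle just described can be sketched in Python as
follows. This is a minimal illustration under stated assumptions,
not the implementation used in this work: the function names
step and fired_muscles, the multiplicative form of the decay, the
uniform noise, and all numeric values are hypothetical.

```python
import random


def step(activation, weights, sensors, decay=0.1, noise=0.0, rng=random):
    """Advance a spreading-activation network by one tick.

    activation: dict node -> current activation level
    weights:    dict (source, target) -> connection strength
                (positive = excitatory, negative = inhibitory)
    sensors:    dict node -> external stimulation for this tick
    """
    # Constant decay (assumed multiplicative) keeps overall activation steady.
    new = {node: level * (1.0 - decay) for node, level in activation.items()}
    # Sensors stimulate their neurons.
    for node, stimulus in sensors.items():
        new[node] += stimulus
    # Each neuron passes activation along its weighted connections.
    for (source, target), strength in weights.items():
        new[target] += strength * activation[source]
    # Random noise (assumed uniform) is injected into every neuron.
    for node in new:
        new[node] += rng.uniform(-noise, noise)
    return new


def fired_muscles(activation, muscles, threshold=1.0):
    """Return the muscle neurons whose activation has reached the threshold."""
    return [m for m in muscles if activation[m] >= threshold]
```

With a steady stimulus on a hypothetical "sight" neuron that
excites a "move" muscle neuron and inhibits an "eat" muscle
neuron, "move" crosses the threshold after a few ticks while
"eat" is driven negative and never fires.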

The animat's environment, implemented on a
Macintosh II, is a bounded arena (figure 1) with morsels


Figure 1

Abstract

We  introduce  the  notions  of  Raw  and  Full 
Cognitive  Maps  and  Absolute  Space 
Representations  (ASRs)  and  discuss  (i)  the 
value  of  computing  a  representation  of  the 
local  environment  (ASR),  (ii)  the  need  for  a 
global  representation  of  one’s  immediate 
surroundings,  (iii)  categorisation  and  the 
formation  of  concepts  and  (iv)  the  problem  of 
planning  a  route. 

0.  Introduction 

The problem of computing a cognitive map is fundamental to
any autonomous mobile system, be it a rat, a human or a robot.
When Tolman (1948) first suggested the idea of a cognitive
map, he was probably referring to a “map” of the spatial
layout of the environment (mazes in his case) but later it
became clear, especially after Lynch's (1960) work, that the
notion of a cognitive map is a complex one. In the early 70’s,
there was an outcry from geographers, urban planners and
designers that a cognitive map is not a map (see [Downs and
Stea, 1973]). A cognitive map is tied to our spatial behaviour
and it is therefore affected by a wide variety of factors ranging
from our mode of travel and past experiences to our
preferences and attitude. One problem with this view is that
it leads to a confusing use of the term and one is often left with
the impression that a cognitive map holds one’s entire
knowledge. However, if the system is to adapt and survive in
a (hostile) world with other agents in it, such factors must be
considered.

Cognitive mapping is therefore a complex process which
involves both one’s perception and conception of the outside
world. Studies which emphasised only one level, either the
perceptual (e.g. work on autonomous mobile robots) or the
conceptual (e.g. early models), were at best incomplete and
very often asked many questions inappropriate at that level.
For example, robotics researchers were concerned with how
to partition the environment in terms of spaces large enough
for the robots to plan a collision-free path, but planning a
collision-free path is a local problem and an attempt to solve
it at the path planning level is inappropriate (for more detail,
see [Yeap et al., 1990]).


Our past work has been the development of a
computational theory of cognitive maps to explain what
needs to be computed and why [Yeap, 1988, 1990; Yeap and
Robertson, 1990; Yeap et al., 1990]. We stress the importance
of studying the process as a whole, from perception to
cognition and generally in that order. Our investigation of the
process begins with Marr’s (1982) computational theory of
vision and, as in Marr’s work, the notion of a representation
is central to our study. The next section presents a brief
overview of the theory, the main idea being that a cognitive
mapping process should first compute a raw map and then a
full map. Using this theory we discuss four important issues
in cognitive mapping, two at each level: (i) the significance of
computing a representation of the local environment (an
Absolute Space Representation or ASR) and (ii) the need for
a global representation of one’s immediate surroundings; (iii)
categorisation and concept formation and (iv) the problem of
planning a route. These issues arise from the insights gained
from the implementation of the theory and from further
consideration of the nature of the cognitive mapping process.

1.  A  Computational  Theory  of  Cognitive  Maps 

Although there are many factors which influence our
conceptual view of the world, our representation of the world
must begin from what we perceive. This observation suggests
that the first step in a cognitive mapping process is to compute
a representation of the physical environment. Since the
different conceptual views of the world are but different ways
of looking at what is already computed, this representation of
the physical environment should be fairly independent of the
conceptual representations that are developed later. Our
theory therefore suggests that a cognitive mapping process
should be studied as a process consisting of two
loosely-coupled modules. An early cognitive mapping process
computes a representation of the physical world as perceived
by our senses. We call this representation a Raw Cognitive
Map, indicating that the computed representation is not
interpreted. A later cognitive mapping process computes a
representation of the conceptual world. We call this
representation a Full Cognitive Map; the word “full”
indicates the full richness of the map as a cognitive
representation.
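
The loose coupling between the two modules can be illustrated with
a short Python sketch. It is only a schematic reading of the
paragraph above, not the implementation of the theory: the class
and function names, the use of boundary points to describe a
perceived space, and the labelling functions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]


@dataclass
class RawCognitiveMap:
    """Uninterpreted record of perceived spaces (assumed layout)."""
    spaces: Dict[int, List[Point]] = field(default_factory=dict)


@dataclass
class FullCognitiveMap:
    """Conceptual labels layered over the raw map (assumed layout)."""
    concepts: Dict[int, str] = field(default_factory=dict)


def early_process(raw: RawCognitiveMap, boundary: List[Point]) -> int:
    """Store one perceived space without interpreting it; return its id."""
    space_id = len(raw.spaces)
    raw.spaces[space_id] = list(boundary)
    return space_id


def late_process(raw: RawCognitiveMap,
                 interpret: Callable[[List[Point]], str]) -> FullCognitiveMap:
    """Build a conceptual view by reading, never modifying, the raw map."""
    return FullCognitiveMap(
        {space_id: interpret(boundary)
         for space_id, boundary in raw.spaces.items()}
    )
```

Because late_process only reads the raw map, two different
interpret functions yield two different conceptual views of the
same stored perception, which mirrors the independence argued for
above.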